EXECUTIVE SUMMARY



The current report describes the results of an interim evaluation of
selected aspects of the Follow Through program. Congress authorized Follow
Through in 1967 under an amendment to the Economic Opportunity Act to
provide comprehensive health, social, and educational services for poor
children in primary grades who had experienced Head Start or an equivalent
preschool program. The enabling legislation anticipated a large-scale
service program, but appropriations did not match this vision. Accordingly,
soon after its creation, Follow Through became a socio-educational
experiment, employing educational innovators to act as sponsors of their
own intervention programs in different school districts throughout the
United States. This concept of different educational improvement models
being tried in various situations was called "planned variation."

The evaluation of Follow Through is the evaluation of the effectiveness
of the sponsored educational models as they are implemented in various
school districts. School districts are recommended for participation by
state education officials and are awarded grants by the U.S. Office of
Education. School communities choose a model from among those offered by
sponsors. These sponsored educational programs represent the only distinct
part of the experimental treatment. Parent advisory committees and
nutrition, medical, dental, and social service components must be present
in every Follow Through program, but they are not specified by type.
Evaluation of the Follow Through program consists primarily of determining
which approaches are effective in achieving a specified set of educational
objectives for children and a variety of changes in parent-school relations.

The SRI evaluation of the impact and effectiveness of Follow Through,
both as an overall program and as a collection of diverse "treatments" with
varying goals and emphases, has been approached at a number of levels. In
part, the evaluation was designed to answer policy-relevant questions,
such as the following:

• Are any approaches having positive impact on children, parents,
school, and community?

• Which approaches appear most effective and under what conditions?

At another level, the evaluation seeks to discover in what ways and to
what extent planned variations in approaches are occurring. At still
another level, the evaluation seeks to develop useful data and to advance
the state of the art regarding research on large-scale, nonexperimental
social intervention programs, such as Follow Through.

Design of the Evaluation 

The basic design for longitudinal evaluation of Follow Through, on
which this interim report is based, is summarized as follows:

(1) A set of projects that have had at least one year's experience
with a sponsored Follow Through approach are sampled for
participation in the national evaluation. This sampling process
is based on criteria such as participation in Head Start planned
variation, ethnic or minority group representation,
representation of different sponsor approaches, and regional and
community characteristics.

(2) For each school participating in the Follow Through experiment,
a comparable school in the same district that is not receiving a
Follow Through grant is recruited to serve as the non-Follow
Through comparison group. A Follow Through (FT) school, or group
of classrooms, operating in accordance with a sponsor's "model"
and the non-Follow Through (NFT) comparison classrooms define a
Follow Through project.

(3) Within each project, five categories of measurements are
obtained: pupil classroom demographics; cognitive and
noncognitive pupil measures; parent interviews; teacher
responses to questionnaires; and project and community
descriptors. An additional category of measures--classroom
observation processes--is collected on a limited number of
Follow Through and non-Follow Through classrooms.

(4) The original SRI evaluation plan called for collection of all
major categories of measures during the beginning period for
each annual group of participants--or cohort--and at specified
successive time points, generally at the end of each grade year.

Due to administrative difficulties, collection of baseline measurements
for Cohort I samples was not completed until December 1969, creating
serious analysis problems for evaluation of program impacts on this
cohort. Cohort II measures, however, were gathered well within the intended
baseline interval (second to fourth week following commencement of school).
Parent interviews were to be limited to two times--once in the initial
year and again at the end of third grade. Since the Follow Through
experiment provided for four years of "treatment" for kindergarten cohorts
(three years for children entering at the first grade), and since there
were to be four successive cohorts, a total of 16 evaluation points
existed in this plan. Subsequent modifications required reducing the
size of the evaluation samples that were included in intermediate testing,
although the total plan still includes 16 evaluation points.

This interim report is based on a limited set of two annual cohorts--
one which has progressed two years through the four- (or three-) year
program (Cohort I), and one which has progressed only one year (Cohort II).
In terms of the 16-cell design, this report is based on evidence from only
three cells, as shown in the tabulation of school year progression of
Follow Through Cohorts by grade stream, which follows:


                                       Experience Year in School
Cohort              Grade Stream   First      Second     Third      Fourth

Cohort I            Kindergarten   1969-70*   1970-71*   1971-72    1972-73
(Enter Fall 1969)   First Grade    1969-70*   1970-71*   1971-72

Cohort II           Kindergarten   1970-71*   1971-72    1972-73    1973-74
(Enter Fall 1970)   First Grade    1970-71*   1971-72    1972-73

Cohort III          Kindergarten   1971-72    1972-73    1973-74    1974-75
(Enter Fall 1971)   First Grade    1971-72    1972-73    1973-74

Cohort IV           Kindergarten   1972-73    1973-74    1974-75    1975-76
(Enter Fall 1972)   First Grade    1972-73    1973-74    1974-75

* Periods and groups covered by analyses in this report.

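
The 16-cell layout of the tabulation above can be enumerated in a few lines. The following is a minimal sketch in Python, offered only as an illustration and not as part of the original evaluation. The names `ENTRY_YEARS`, `design_cells`, and `COVERED` are hypothetical, and the two grade streams are collapsed so that each cohort contributes four experience-year cells, as the text's count of 16 implies.

```python
# Illustrative sketch only: enumerates the 16 cohort-by-year evaluation
# cells described in the text. All names here are hypothetical.

ENTRY_YEARS = {"I": 1969, "II": 1970, "III": 1971, "IV": 1972}
N_EXPERIENCE_YEARS = 4  # kindergarten stream; entering-first stream has 3


def school_year(start: int) -> str:
    """Format a school year label, e.g. 1969 -> '1969-70'."""
    return f"{start}-{(start + 1) % 100:02d}"


def design_cells() -> dict:
    """Map (cohort, experience year) -> school-year label."""
    return {
        (cohort, k): school_year(entry + k - 1)
        for cohort, entry in ENTRY_YEARS.items()
        for k in range(1, N_EXPERIENCE_YEARS + 1)
    }


# Cells covered by the interim report: Cohort I years 1-2, Cohort II year 1.
COVERED = [("I", 1), ("I", 2), ("II", 1)]

if __name__ == "__main__":
    cells = design_cells()
    print(len(cells), "cells in the full design")
    for key in COVERED:
        print(key, cells[key])
```

Running the sketch lists the three covered cells against the full design, which makes plain how small a share of the planned evaluation points this interim report draws on.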
The evidence of program impact was developed from systematic
organization of baseline measurements (taken at entrance to the program) and
progress measurements (taken at the end of each school year) into
outcome, process, and control variables. Three classes of outcome measures
or evaluation foci were generated: child, parent, and teacher.



Measures of program impacts on children were: 

(1) Total achievement--the raw score sum of all correct
responses on all cognitive test items.

(2) WRAT achievement--the raw score sum of correct responses
to the Wide Range Achievement Test.

(3) Quantitative skills--the raw score sum of correct
responses to items pertaining to quantitative concepts
such as numeration, operations (addition, subtraction,
etc.), and word problems.

(4) Reading skills--the raw score sum of correct responses
to items requiring reading or reading-related skills
(including pre-reading). These include such skills as
alphabet/letter recognition, matching and copying, figure
copying, word matching, symbol matching, and oddity
discrimination.

(5) Language arts--the raw score sum of correct responses
to items requiring language, lexicographic, or grammatical
skills such as analogies, word meaning, spelling, and
concept activation.

(6) Cognitive processes--a residual category consisting of the
raw score sum of correct responses to items requiring
perceptual motor skills and concept identifications.

(7) Affect--the scaled sum of the child's answers to questions
about how he felt toward school, learning, himself,
friends, etc.

(8) Attendance--the number of days absent reported for the
preceding school year.

Measures of program impacts on parents were: 

(1) Parent/child interactions--the extent to which parents
report that they actively interact with their children
in such activities as talking with their children, taking
their children on trips, helping their children with
school work, reading to them, accepting assistance from
them, and acknowledging their progress in school.

(2) Parent/school involvement--the extent to which parents
report that they are actively participating in various
school-related activities, such as classroom visits,
volunteer assistance, parent/school meetings, and
external contacts with school personnel.

(3) Child-academic expectation--the extent to which the parent
reports satisfaction with the child's progress and optimism
regarding the child's future, both academic and nonacademic
(e.g., what are the child's expected grades, chances of
getting a good job, chances of going on to college?).

(4) Sense of control--the extent to which the parent reports
a sense of concern and control over school procedures,
educational reforms, and school awareness of and
responsiveness to parent and community desires and needs.

Measures of teacher-level program impacts were:

(1) Parent-educator image--the extent to which teachers reported
they felt it essential to "get together with parents outside
of the classroom" for purposes of

• Improving children's learning 

• Improving classroom teaching 

• Learning parents' views on teaching

• Improving school services to parents 

• Improving school services to children 

• Improving school services to community 

• Parental understanding of school program. 

(2) Professional acceptance of method--the extent to which the
teacher reports she would not prefer to adopt some teaching
approach other than the one she is currently using.

Data obtained from classroom observation procedures were organized
and factor analyzed, yielding the following five classroom process scales:

(1) Self-regulatory--the extent to which children work
independently on activities not strictly academic.

(2) Child-initiated interactions--the extent to which children
initiate interactions and receive positive or negative
feedback from adults.

(3) Programmed academic--the extent to which adults teach
small groups of children by highly structured
question-response-reinforcement interactions.

(4) Expressive--the extent to which positive and negative
affect was expressed by both children and adults.

(5) Child self-learning--the extent to which children work
alone with books or seat-work materials.

Hypotheses regarding program impacts on each of these evaluation
foci were formulated at several levels: overall and by individual
approaches in terms of duration of treatment (one year or two years) and
in terms of successive cohort experiences (C-I or C-II). Classrooms
defined the units of analysis for assessment of effects and hypothesis
tests, and classroom scores were composed of the scores of only those
pupils for whom both pre- and post-measurement data were available.
Parent data from classroom-grouped pupils were similarly grouped. Where
necessary, certain missing values were imputed from school and project
means.
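
The complete-case rule for building classroom scores can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the original SRI procedure: the record fields `classroom`, `pre`, and `post` and the function `classroom_means` are hypothetical names, a missing measurement is represented as `None`, and the imputation from school and project means mentioned above is not shown.

```python
# Illustrative sketch only: classroom-level scores built from pupils who
# have both pre- and post-measurements. Field names are hypothetical.
from collections import defaultdict
from statistics import mean


def classroom_means(pupils):
    """Return {classroom: (mean_pre, mean_post, n)} using complete cases only."""
    grouped = defaultdict(list)
    for p in pupils:
        if p.get("pre") is None or p.get("post") is None:
            continue  # pupil lacks a pre- or post-measurement; excluded
        grouped[p["classroom"]].append((p["pre"], p["post"]))
    return {
        room: (mean(a for a, _ in pairs), mean(b for _, b in pairs), len(pairs))
        for room, pairs in grouped.items()
    }


if __name__ == "__main__":
    sample = [  # made-up values, for illustration only
        {"classroom": "A", "pre": 10, "post": 14},
        {"classroom": "A", "pre": 12, "post": None},
        {"classroom": "A", "pre": 8, "post": 12},
        {"classroom": "B", "pre": 9, "post": 11},
    ]
    print(classroom_means(sample))
```

Because the classroom, not the pupil, is the unit of analysis, each classroom contributes a single pair of means to the later analyses regardless of how many pupils it retains.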

Four basic analysis groups were created, corresponding to cohorts
and entrance points within cohorts. These groups are Cohort I-K
(kindergarteners entering FT in Fall, 1969), Cohort I-EF (first graders, in
schools without kindergarten, entering FT in Fall, 1969), Cohort II-K
(kindergarteners entering FT in Fall, 1970), and Cohort II-EF (first
graders, in schools without kindergarten, entering FT in Fall, 1970).
Cohort I data were further organized into one-year effects (1969-1970)
and two-year effects (1969-1971) subsets.

The basic statistical procedure for analysis of program effects was
fixed effects one-way analysis of covariance (ANCOVA), with planned
variations defining the treatment variable. Separate but parallel ANCOVAs
were performed on project-level and sponsor-level treatment groupings.
These analyses were conducted separately on each data grouping (cohort
and grade stream) and for each set of outcomes (pupil, parent, and
teacher). Individual project results were obtained by means of planned
comparisons (linear contrasts) of corresponding FT with NFT subgroups.
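
For readers unfamiliar with the technique, the covariate-adjusted FT-versus-NFT contrast at the heart of a one-way ANCOVA can be sketched in Python. This is a simplified illustration under stated assumptions, not the original analysis code: it uses a single baseline covariate and two groups, the function names are hypothetical, and it returns only the adjusted difference, without the significance test the report relies on.

```python
# Illustrative sketch only: covariate-adjusted difference between two
# groups, as in a one-way ANCOVA with one covariate. Names are hypothetical.
from statistics import mean


def pooled_within_slope(groups):
    """Pooled within-group regression slope of post-score on pre-score."""
    sxy = sxx = 0.0
    for pre, post in groups:
        mx, my = mean(pre), mean(post)
        sxy += sum((x - mx) * (y - my) for x, y in zip(pre, post))
        sxx += sum((x - mx) ** 2 for x in pre)
    return sxy / sxx


def adjusted_ft_nft_difference(ft, nft):
    """Covariate-adjusted FT-minus-NFT contrast.

    Each argument is a (pre_scores, post_scores) pair of equal-length
    lists of classroom means.
    """
    b = pooled_within_slope([ft, nft])
    raw_diff = mean(ft[1]) - mean(nft[1])
    baseline_diff = mean(ft[0]) - mean(nft[0])
    # Remove the part of the raw difference explained by baseline mismatch.
    return raw_diff - b * baseline_diff


if __name__ == "__main__":
    # Made-up classroom means: the FT group starts lower at baseline.
    ft = ([6, 8, 10, 12], [8, 9, 10, 11])
    nft = ([10, 12, 14, 16], [7, 8, 9, 10])
    print(adjusted_ft_nft_difference(ft, nft))
```

In the made-up data the raw post-test difference understates the adjusted one because the FT classrooms began below the comparison classrooms. The adjustment rests on a common slope across groups, an assumption that becomes doubtful when the two groups are poorly matched at baseline.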



Summary of Significant Program Impacts 

Significant FT-favoring results of the analyses conducted on these
interim data are summarized separately for each sponsor. That is, in
this summary, only the significant (p<.05) results in favor of the Follow
Through group are reported. The complete results, as presented in the
text, are far too complex and extensive to report adequately in this
summary. We have concentrated on presenting FT-favoring findings
because we assumed they would be of principal interest.

The results of pupil outcomes are reported separately for the analysis
of two-year and one-year data. Since parent impacts are measured
during the first year of the child's participation in the program and
since teacher impacts are the results of the most recent teacher survey,
these results are not summarized separately by cohorts.

The Far West Model (FW): The Responsive Educational Program

Seven project samples were included in the analysis of interim effects
for the Far West Laboratory approach. Analyses of two-year data show
significant FT-favoring pupil differences on the quantitative skills
measure; analyses of one-year data show FT-favoring pupil differences on
the cognitive processes measure. Significant parent impacts were noted
on the parent/child and parent/school interaction measures. No
significant FT-favoring teacher impacts were noted.

The University of Arizona Model (UA): The Tucson Early Education
Model

Five project samples were included in the analyses of interim effects
for the University of Arizona approach. Analyses of one-year data show
FT-favoring pupil differences on the affect measure only. Significant
parent impacts were noted on the parent/school involvement measure, and
significant teacher impacts were noted on the acceptance of method measure.

Bank Street Model (BC): The Bank Street College of Education
Approach

Seven project samples were included in the analyses of interim effects
for the Bank Street approach. Analyses of two-year data show significant
FT-favoring pupil differences on the quantitative skills and cognitive
processes measures. Analyses of one-year data show FT-favoring pupil
differences on overall achievement, on the WRAT measure, and on the reading
and language arts subscores. Significant parent impacts were noted on
the parent/child interaction measure, and significant teacher impacts were
noted on the acceptance of method measure.
The University of Georgia (UG): The Mathemagenic Activities
Program

A single project sample for the University of Georgia was included
in the analyses. No FT-favoring significant differences were noted on
outcome measures, but since this project was in its initial
implementation year (first year of affiliation) with the model, no evaluation
conclusions are appropriate.

The University of Oregon Model (UO): The University of Oregon
Engelmann/Becker Model for Direct Instruction

Seven project samples were included in the analysis of interim
effects of the University of Oregon approach. Analyses of two-year data
show significant FT-favoring pupil differences on the attendance measure.
Significant one-year effects were noted on the overall achievement
measure, attendance measure, and the WRAT measure. Significant parent
impacts were noted on the parent/child interaction measure, and significant
teacher impacts were noted on the acceptance measure. Substantial analysis
problems were encountered with these project data due to non-equivalence
of treatment and comparison groups.

The University of Kansas (UK): The Behavior Analysis Approach

Three projects were included for analyses of interim effects from
the University of Kansas approach. Analyses of one-year data show
significant FT-favoring pupil differences on the achievement and WRAT
measures and on the quantitative and reading skills measures. No other
FT-favoring differences reach significance for this model. Substantial
analysis problems were encountered with these project data due to
non-equivalence of treatment and comparison groups.

High/Scope (HS): The Cognitively Oriented Curriculum Model

A total of three project samples were included in the analyses of
interim effects for this model. Analyses of two-year data show
significant FT-favoring pupil differences on affect and attendance. Analyses
of one-year data show FT-favoring differences on affect only. Significant
parent impacts were noted in the parent/child, parent/school, and parent
expectation measures. No significant FT-favoring teacher impacts were
noted. Substantial analysis problems were encountered with these project
data due to non-equivalence of treatment and comparison groups.
University of Florida (UF): The Florida Parent Education Model

Five project samples were included in the interim evaluation of the
University of Florida approach. Analyses of the two-year data show
significant FT-favoring pupil differences only on attendance. Analyses of
one-year data show FT-favoring pupil differences on the achievement
measure, the WRAT measure, the affect measure, the quantitative skills
measure, the reading skills measure, and the language arts measure.
Significant parent impacts were noted on the parent/school interaction
measure, and significant teacher impacts were noted on the acceptance of
method measure.

The EDC Model (ED): The EDC Open Education Program 

Four project samples were included in the interim evaluation of the
EDC model. Analyses of two-year data show significant FT-favoring pupil
differences on the quantitative skills measure. Analyses of one-year
data show significant FT-favoring pupil differences on attendance and on
cognitive processes. Significant parent impacts were noted on the
parent/school involvement measure, and significant teacher impacts were noted
on the parent image measure.

The NYU Model (NY): The Interdependent Learning Model 

Three project samples were included in the interim evaluation of the
NYU model. Analyses of two-year data show significant FT-favoring pupil
differences on attendance and on the quantitative skills measure.
Significant FT-favoring one-year effects failed to occur. Significant
teacher impacts were noted on the acceptance of method measure. Significant
parent impacts failed to occur in these projects.

The Southwest Educational Development Model (SW): Language
Development (Bilingual) Approach

A single project was included in the evaluation of the Southwest
model. Analyses of the two-year data showed significant FT-favoring
pupil differences on the achievement measures, on the quantitative skills
measure, and on the reading skills measure. Significant parent or teacher
impacts failed to occur, although parents were significantly more
satisfied with their child's progress.
Self-Sponsored

In addition to sponsored projects, there are six projects from the
early group of pilots preceding the planned variation phase of Follow
Through that elected to remain unsponsored (the only projects included in
this interim evaluation which exercised this option). They are classified
as self-sponsored or parent-implemented and have instituted programs they
themselves have developed. Analyses of two-year data show FT-favoring
pupil differences on the achievement and WRAT measures, on attendance, on
quantitative skills, and on reading skills. Significant one-year effects
were observed on affect, on achievement, on quantitative skills, and on
language arts. Significant parent impacts were noted on the parent/child
interaction patterns and on parent/school involvement. Significant teacher
impacts were noted on the acceptance of method measure.

Again, the reader is cautioned that the above paragraphs summarize 
only the significant FT-favoring results. A more complete presentation of 
findings and their interpretations can be found in the text of this report. 

Process Indicators of Follow Through Treatments 

The five classroom process scales (factor scores) were qualitatively
analyzed in conjunction with project impact data. These analyses tended
to show that (a) FT classroom activities correspond with sponsor
emphases, (b) clear distinctions between FT and NFT classroom activities
occur, and (c) patterns of activities (factor score profiles) are
reasonably consistent among projects employing the same models. These
interpretations, however, are based only on qualitative analyses of process
score profiles.

More detailed and rigorous analyses conducted on the discrete
variables generated from the observation instruments displayed reliable
overall FT/NFT differences primarily on components related to the presence of
several adults in the classroom. This result is important, since a
favorable adult/child ratio is a necessary condition for the implementation of
many critical features of the planned variations (or critical components
of the treatments). Additional analyses showed, to some extent,
predictable rank ordering of the planned variations on many of the discrete
observation variables.

This evidence, taken together, suggests the following interpretations:

(1) Sponsored approaches do differ discernibly from one another 
on many process variables. 
(2) Processes characteristic of various Follow Through approaches
predictably depart from characteristics observed in non-Follow
Through classrooms on many process variables.

(3) Analysis of factor scores and of discrete variable scores
presents strong evidence of instructional activities and
components that correspond well with descriptions of intended
approaches, thus validating in part the concept of planned
variations in FT treatments.

Overall Results 

Overall interim results were analyzed both in terms of average
project results by grade stream within cohort and in terms of percentage and
frequency of FT-favoring outcomes in relation to the quality of comparison
group match. Average project results are slightly in favor of Follow
Through for Cohort I-K two-year pupil outcomes, and comparison of one-
and two-year results shows two-year effects as systematically greater.
Cohort I-EF, on the other hand, displays a slight NFT-favoring trend on the
two-year pupil outcomes, and comparison of one- and two-year results shows
second-year deficits for FT. Cohort II average effects all tend to favor
Follow Through, although the differences are greater for the entering
first grade group than for the kindergarten group.

With the exception of the child academic expectation and parent image
measures, results on parent and teacher measures tended to show positive FT
impacts regardless of cohort. The image and expectation measures tended
to indicate negative impacts. Further investigation is needed to uncover
reasons for these reversals.

Analyses of the frequency and proportion of FT-favoring results in
relation to the quality of the FT/NFT baseline match (good, moderate, or
poor, based on seven pupil/parent indicators) show a strong relationship
of outcomes to quality of match, particularly for Cohort I data. Where
FT and NFT were well matched, results tend to favor FT.
Where the samples were poorly matched, results were generally NFT-favoring
(primarily because the initial mismatch is strongly biased against the
FT group). Further, comparison of Cohort I results with Cohort II shows
program impact as systematically stronger for the latter, suggesting a
program maturation or improved implementation effect.

When these interim results were reviewed within the perspective of
the overall evaluation design, the likelihood of obtaining FT-favoring
pupil, parent, or teacher results appears to be associated with several
rather crucial evaluation parameters. In particular, the magnitude and
frequency of FT-favoring pupil results appear related to:
• The relative comparability of families in the FT and NFT samples
within a project (quality of match). That is, as the quality of
the match improves, the frequency and proportion of FT-favoring
results also tend to improve. That bad matches tended to yield
NFT-favoring results is primarily because the initial biases were
extreme in favor of NFT, often suggesting that two separate
populations were being compared.

• The severity of impoverishment and disadvantagement relative to
the mainstream social structure. Projects in the most impoverished
communities showed some of the most dramatic gains, but these were
sometimes statistically unreliable and often confounded with
comparison group problems. This trend may indicate the presence of
a type of floor effect, but more likely it is associated with major
differences in the social complexities of rural and urban
communities.

• The amount of time the sponsor has had to refine and improve
implementation of his treatment. In general, first-year impacts
for 1970 samples (C-II) were stronger than for 1969 samples (C-I).
Although this trend is confounded by certain measurement
difficulties associated with the first-year Cohort I data, the differences
appear large enough to support our interpretation.

• The grade level of the pupils and the amount of time they spent
in the program. This interpretation is suggested by the fairly
regular cumulative trend observed for the Cohort I-K samples
(second-year effects were almost always stronger than first-year
effects). Also, the effects on Cohort II-EF samples (pupils in the
first grade) tended to be larger than those on Cohort II-K samples.
These trends do not obtain for Cohort I-EF samples, probably because
the proportion of "good" matches in these samples was very low
(i.e., 14 percent for Cohort I-EF versus 50 percent for Cohort I-K).

When the four trends evident at this interim point are combined, it
appears that Follow Through has most often been successful in projects
located in truly disadvantaged communities when there has been enough time
to implement the model properly. In addition, the effects appear
cumulative, and impacts appear stronger at higher age levels.

Some Caveats 

We wish to underscore the need for caution in generalizing the inter- 
pretations of the results we have detected to date. Some major reasons 
for this caution are as follows: 
The samples on which these interim results are based are small,
certainly too small to allow us to isolate approaches that "work"
and approaches that do not. We can conclude that some changes
are taking place, but we do not yet know precisely what they are
or why they are occurring. At a more general level, the parent,
teacher, classroom observation, and community data indicate that
Follow Through is succeeding in measurably altering adult attitudes
and behaviors in the home, the school, and the community. Evidence
that these changes in adults are having impact on the children
is less marked and more variable, but results tend to indicate
positive effects on FT pupils. It is likely that in future analyses
on larger and more representative samples, evidence of program
impacts on pupil attitudes and achievements will be considerably
more marked.

In addition to the limitations imposed by the relatively small
interim evaluation samples, we encountered complex problems of
missing data. These resulted from high attrition and,
particularly for Cohort I, inadequate baseline data. The magnitude of
these problems was greater than originally anticipated because
of the unprecedented nature and scope of this research program.
And, although we now know how to cope with them, they restrict
our ability to generalize about findings for Cohort I samples,
and to a lesser extent about findings for Cohort II samples.

Since Follow Through is a quasi-experiment, the allocation of
treatments to projects and the allocation of units to treatment
or control conditions within projects were nonrandom. One
consequence of this nonrandomness was that biases were introduced
into the design. The bias associated with the allocation of
treatments to projects may not be very serious. But the
nonrandomness within projects (i.e., systematic differences between FT
and NFT samples) occasionally has serious consequences. For
example, in some projects, treatment and comparison groups were
very different. Although such differences are bound to occur in
quasi-experiments for which control groups are assembled post hoc,
they present serious obstacles to the interpretation of outcomes.
And where comparison group biases are severe, we suspect they
invalidate the results of analyses for the projects affected.

These problems (missing data, differences between comparison and
treatment groups, and too few classrooms per project) combine to
produce relatively low statistical power in our analyses for
effects. To some extent this outcome was expected, since the U.S.
Office of Education made a conscious decision to concentrate data
collection efforts at the entry grade (K or EF) and at the exit
grade (3) and to devote less effort at the intermediate grades.
Nevertheless, we are quite likely failing to detect many
important program impacts at this interim point.

As suggested above, a substantial number of program impacts are
evident in our analyses of interim data. Furthermore, we believe
that the true magnitude of the effects is probably somewhat greater
than detected by our analyses. But it is important to recognize
that even if the number of significant effects were strikingly
greater, we would still have difficulty interpreting how or why
such results occurred because, at present, our knowledge
of the treatments is confined almost exclusively to the sponsors'
descriptions of them. We do have evidence from limited subsamples
on some of the characteristics of some processes. This qualitative
evidence indicates that classroom processes conform to these
treatment descriptions. To interpret how and why results occur, we
now need clear operational statements of what a sponsor does when
he is installing and maintaining a project and how he does it.

Finally, because of the complexity and variety of the intervention
approaches, or treatments, in the FT experiments, it is very likely
that many of the evaluation measures used were not uniformly
appropriate, sensitive, or relevant to varied objectives. Many
program objectives were probably overlooked in our assessments.
The technology for evaluating large-scale social programs is in
its infancy. We believe that we have contributed substantially
to the advancement of this technology through our successful and
unsuccessful experiences with evaluation instruments and procedures.
Yet there remains much more to be learned.
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INTRODUCTION 



As originally authorized in the Economic Opportunity Act amendments
of 1967 (P.L. 90-222, Section 222), Follow Through (FT) was a program of
comprehensive services--including dental, medical, and nutritional
services; an instructional program; and psychological counseling, all with
parental and community participation--for disadvantaged children in the
primary grades of schools throughout the nation. As part of the war on
poverty, Follow Through was conceived as an extension of Head Start when
that preschool program, by itself, did not seem to promote enduring
developmental gains (Wolff and Stein, 1966). In contrast to the notion
that intervention programs should begin with still younger children (a
notion that led to the development of the Parent Child Centers), Follow
Through was based on the assumption that a sustained, multifaceted
intervention that demands participation from the parents and community as
well as the child, throughout the child's primary years, would contribute
most to breaking the "cycle of poverty."

Underlying all of the complementary programs (i.e., Parent Child
Center, Head Start, Follow Through, and other poverty and compensatory
education programs) were some major theoretical shifts in view toward
social services. One was the change from viewing poor persons and
minority persons as inferior individuals responsible for their own
position to viewing them as victims of a system and blaming environmental
factors, the subculture, or the societal institutions for failing to
provide equal opportunities for success. Giving poor communities and
minority groups more real power to control their own environment (by
changing institutions such as the elementary school, the welfare
departments, and the medical profession) rather than giving direct
charity to individuals was optimistically viewed as the solution to many
social problems. While the Head Start and Follow Through programs still
represent somewhat ambiguous views toward the poor and minorities, the
pervading philosophy does differ from old modes of social service and is
aimed at preventing, rather than remediating, social and economic
problems.

* See Hess's article (1969) on four different explanatory models for lower
intellectual attainment by low-income and minority groups. See also
S. Baratz and J. Baratz (1970), in which the authors argue that social
scientists have merely changed from blaming the children's inferior
inheritance for their intellectual performance to blaming the children's
inferior cultural milieu.

A second theoretical notion on which high expectations were based
was that regarding the great plasticity and responsiveness to
environmental stimulation of the human intellect in the early years of life
(Hunt, 1961; Bloom, 1964). While this "critical period" hypothesis was
no longer held in its strong form after a few years of experience with
the Head Start and other preschool programs, there was still reason to
believe (e.g., the studies of Skeels, 1966) that a sustained, enriched
environment would bring lasting advantages--would allow children to obtain
the basic skills and motivation needed to learn, to succeed in school,
and then to obtain satisfying productive employment as adults and raise
a new generation outside of the poverty mold.



Follow Through as an Experiment 

Before the Follow Through program could be launched on a scale
comparable to the Head Start program, which has now reached over 5 million
preschoolers, events occurred that radically changed its form and its
raison d'etre. Much less money was appropriated than was expected. It
was decided to use the period until more funds were made available to
learn more about compensatory education by conceiving of Follow Through
as a research and development effort. The U.S. Office of Education (USOE)
sought advice from the research community and found a number of educators
willing to try out their methods or programs on a larger scale in actual
school situations.

Eventually the program, still funded at levels substantially below
original expectations, was changed into an experiment for purposes of
social policy guidance.

The Office of Education, which administers Follow Through, 
prepared a menu of project-types from which applicants would 
select the one most suitable to their circumstances, and an 
evaluation plan that would use common measures to assess all 
projects (Timpane, 1970, p. 557). 

Individual decisions too numerous to mention were involved in the
evolution of the final set of goals and evaluation plan imposed on the
Follow Through program. But several historical trends underlay the
decisions to shape the Follow Through program into a kind of large-scale
social experiment. Most important among such trends were the following:

(1) Disillusion with present understanding of social problems and
their cures made it imperative to find out more before investing
heavily.

(2) Growing pressure for public accountability and knowledgeable
program planning and policy-making in the government, as evidenced
by the installation of Program Planning Budgeting System
(PPBS) in government departments and by Congressional mandates
(e.g., Title I of ESEA 1965, Section 402 of the Civil Rights
Act of 1964), demanded that programs be evaluated.

(3) Earlier piecemeal evaluations of educational changes
(documented by Hawkridge et al., 1968, and Averch et al., 1971)
and Head Start programs, where information on success or 
failure of individual centers was often confounded with the 
center's location, yielded no policy-relevant information 
and, thus, indicated that more comprehensive research was 
necessary. 

(4) Large-scale evaluations of program effectiveness and suggestions
for such other new concepts as experimental schools were
being advanced by influential commissions and study groups
(e.g., President's Science Advisory Committee, Progress Report
of the Panel of Educational Research and Development, 1964).

(5) The growing realization in Congress and among the public
(Committee for Economic Development, 1968) that directly applying
great amounts of money (e.g., in Title I and Head Start) was
not alleviating social and educational problems; the wisdom
of allocating funds for another comprehensive poverty program
without further knowledge was thus made questionable.

(6) Finally, several discernibly different and promising early
education programs developed with government and foundation
support were available and ready for widespread field testing.

Although remaining a social experiment, Follow Through inevitably
became oriented more toward education than community action, since
responsibility for the program was delegated to the U.S. Office of Education
by the Office of Economic Opportunity and since the social services
included in the program were coordinated through the framework of the
public school system.

By the 1967-68 School Year, when USOE funded 45 planning or pilot
programs, the notion had already developed that Follow Through should be
recast as a research and development program to refine methods of
delivering educational and supporting services to young children. Then, by
the 1968-69* School Year, the guidelines developed by the Office of
Education came to emphasize national evaluation and specified a "planned
variation" approach, under which a number of different early childhood
instructional programs would be implemented, each in a number of
communities throughout the United States. Individuals and educational
organizations involved in research and development on educational
curricula were identified and were asked to present their instructional
approaches to members of communities receiving Follow Through grants. The
individuals and educational organizations were designated "program
sponsors." Their intervention approaches were called "models."

Communities receiving Follow Through grants were obliged to choose
one of the available sponsors' models. A sponsor and a school district
contracted to work together to implement the instructional or parent
education approach and to integrate it with other supporting services
as part of their comprehensive Follow Through program. Variations in
the educational components of the Follow Through program were "planned"
variations only insofar as there was a limited number of sponsors to
choose from--14 originally and 22 at present. The objectives of the
evaluation changed along with the conception of the Follow Through
program and eventuated in policy guidance objectives.

Follow Through as Policy Research

As a social experiment for policy guidance, the Follow Through program
built on the evaluation efforts of the recent past and managed to set
precedents for social experiments to follow. Continued funding for Follow
Through as an experiment and for the evaluation effort indicated a
willingness on the part of legislators and administrators to defer judgment
before proceeding to fund massive social action programs, since the effects
of such programs cannot be accurately predicted. Although tremendous
pressures remain to use resources for spreading services to all who have
need and for satisfying the demands of certain constituencies, there is
at least a recognition that it may be wiser to test the efficacy of
programs aimed at mass behavioral change before applying them generally.

Follow Through, although a compromise between service and
experimental purposes, is far less confused than the "action research"
projects of the late 1950s and early 1960s (e.g., Mobilization for Youth,
the Ford Foundation's "Grey Areas" project) in which researchers and
program directors were often the same people. In these early projects,
formative and summative functions for research were not distinguished, and
conflicts between service and research regarding changes in program goals
were resolved in favor of providing more service. These confusions
permitted few reliable findings about projects and allowed no
generalizations about program success in various settings.

* Communities that had pilot projects in 1967-68 were allowed the choice
of remaining self-sponsored, since they had a year of program
development on their own.

The FT experiment makes it theoretically possible to make discoveries
not permitted by a survey of status design, such as the Equal Educational
Opportunity Survey (Coleman et al., 1966). The Coleman Report did not
evaluate a particular program, but it has been recognized as having
advanced the state of the art of policy-relevant research. It not only
measured the available school resources, or "inputs" to the educational
institutions, thought to be important to equality of opportunity but also,
for the first time, surveyed the outputs of the schools--the performance
of the students. The relationships discovered from these outputs were
startling:

• Schools are remarkably similar in the way they relate to the
  achievement of their pupils when socio-economic factors bear a
  strong relation to academic achievement. When socio-economic
  factors are statistically controlled, however, it appears that
  differences between schools account for only a small fraction of
  differences in pupil achievement (Coleman et al., 1966, p. 21).

• School facilities for children of different races were not
  especially unequal, and where differences did exist they were
  not necessarily in the presumed direction. In any event, it
  did not appear that school facilities had any great influence
  on educational achievement which seemed mostly to derive from
  the family background of the child and the social class of his
  schoolmates (Moynihan, 1969).

From the Coleman survey, however, we are neither able to estimate our
confidence in causal inferences nor to obtain information about the effect
of school programs on children over a period of time.

Since Follow Through is an intervention program, the evaluation need
not be simply a status survey but can be designed to assess changes in
school programs; e.g., using a longitudinal evaluation design, it is
possible to take measurements before, during, and after several waves of
children experience the program.

The Westinghouse/Ohio University study of Head Start (Cicirelli
et al., 1969), like the Coleman study, involved sampling on a national
scale to take into account regional variations and, in addition, was an
evaluation of a program of intervention. It was a bold attempt to
assess the overall impact of the Head Start program on the achievement and
attitudes of the participating children. Because of the time schedule
on which it was conducted, a post hoc evaluation design was employed.
Insofar as possible, subjects with and without the special intervention
had to be "equated" on entering abilities at the one final measurement
point. Inferences about effects of the program over time had to be
made on the basis of groups of children who had entered Head Start in
different years.

The Westinghouse study revealed some relatively small (rarely
statistically significant) differences between Head Start and control groups
on the attitude and achievement tests. However, its design did not permit
much further inquiry into explanations for this main finding; it especially
did not provide clues for program improvement. In terms of policy, at best
it might have aided decisions of the "go/no-go" type but could not provide
guidance regarding how programs might be improved.

The "planned variations" design for evaluating the Follow Through
and the Head Start programs originated partly in response to the absence
in earlier evaluations of information on the differential effectiveness
of various educational approaches. Ideally, under this design,
systematically different strategies can be tested and compared so that more
and less effective techniques can be cited for attaining various goals in
subgroups with varying characteristics. When the "planned variations" idea
was combined with the notion of measurement at several points in time
(before, during, and after primary school), on several successive waves
of children, in the several special programs as well as in comparison
school programs, the evaluation design, in conception, began to take on
the aspects of an experiment.

Thus, when understood in terms of its potential advancement over
past efforts, the concept of the Follow Through experiment is quite
sophisticated. As actualized, it demonstrates that the state of the art of
implementing and evaluating large-scale social action programs is just being
developed.

* It is important to understand that unlike Follow Through, the Head
Start program is for the most part a one-summer or one-year experience
so that measurement in the Westinghouse/Ohio study occurred for some
children as long as 2 years after the end of the intervention period,
rather than during the intervention period.

Stakeholder Interests in the Evaluation



The lack of elegance in the FT evaluation design is perfectly
understandable when the demands that circumstances made on the evaluation are
considered. There is, first, the enormous scope and even contradiction
among the goals held for the Follow Through program by people at many
different levels. The goals range from long-term, abstract, social goals,
such as reducing poverty and racial discrimination, to immediate, concrete,
and specific goals, such as improving the ability of a child to express
himself verbally. The pressure for evaluative information regarding
attainment of each of the objectives is great. Each is important to a group
of people on whom the program depends for its existence.

Members of Congress and administrators in the Executive Office of the
President want to know if the Follow Through program overall enhances the
"life chances of children" or makes poverty families more "self-sufficient."*
Their decisions on continued support for the comprehensive services and
on allocation of funds seem to require information about average per pupil
costs, general participant satisfaction, and benefits derived by children
and their families participating in programs supported under the Follow
Through authorization. Such information is needed yearly, because
appropriations for the Economic Opportunity Act are authorized annually. These
stakeholders will find this document of some use.

* These goals are stated in the Economic Opportunity Act.

State and Federal administrators want to know which educational
programs work best with disadvantaged children and can be implemented in a
variety of settings. Both also want to know the comparative costs of the
programs. While the two groups may vie over the authority to determine
allocations and to make the decisions, they both want the information as
soon as possible to select programs that "work." A recent Federal-State
"5-year plan" for disseminating the most promising Follow Through program
models to local education agencies increases the pressure for information
on the effectiveness of "ready-made" program alternatives. It is
policy-makers at the State and Federal levels to whom this report is primarily
directed.

Local education officials and local service agencies have goals in
mind that dictate different foci for evaluation. They want to know which
program will work for their particular population of children and how to
implement it. Those local people actually involved in implementing Follow
Through models in communities throughout the country have still other
concerns. Evaluation for their own formative purposes might be
appreciated, but this is rarely offered. Delayed summative assessment of their
progress would more likely reflect the effects of satisfactorily
implemented programs and thus would be desired over immediate assessment by
Federally sponsored evaluators. Since the data in the present volume are
analyzed in a manner appropriate to the broader policy questions, the
local policy-maker likely will find that the present report does not
serve his purposes.

The goals of the teachers in the Follow Through classrooms and parents
of participating children are more specific to their particular groups of
children and even to individual children. An evaluation designed to
answer some of their questions would be entirely different from an evaluation
aimed at broader policy questions.

Finally, sponsors who are working to implement their ideas in the
natural laboratory of the public school have somewhat different objectives.
They have several entirely different theories of education and very
different notions about the appropriate agents of intervention and their
roles (parents making curriculum decisions, teachers becoming
experimenters or technicians, teachers becoming staff planners, parents
reinforcing school objectives). Most are also interested in experimentation
as a way of testing hypotheses about intervention techniques and about
children's learning from which better education theory could be built.
Many are themselves engaged in formative evaluation as an aid in refining
their methods. Some would like analysis and documentation of implementation
procedures, descriptions of problems involved in working simultaneously
with a school district, a group of parents, Federal program officers, and
their own staffs to get a Follow Through program to children in school.
Unfortunately, none of these purposes is well served by the present
policy study.

The concept of the evaluation as assessing alternatives may seem
straightforward, but it obscures fundamental value differences that
separate those with various interests in Follow Through. These basic
ferences reside in the question that is implied, but not answered, by 
the assertion that Follow Through is a comparative study of alternative 
approaches; the unasked and unanswered question is "approach to what?" 
Some feel that Follow Through should be used as a vehicle by which the 
educational system itself may be changed in basic ways to be more adapt- 
able to the diverse needs and desires of the children it serves and the 
adults who comprise its political constituency. A more common view is
that the fundamental purpose of an educational system, including Follow
Through, is to bring about desirable changes in people.

Although these two views are not necessarily incompatible, they imply
that different kinds of criteria be used to judge the effectiveness of the
program. For example, if one holds that the essential purpose of Follow
Through is to change the system, then the indicators of program success
that are given most weight will be ones that attend to the behavior and
beliefs of parents vis-a-vis the school and the actions and attitudes
of administrators and teachers who are charged with the school's operation.
On the other hand, for those who view the school as an institution to
bring about desired changes in children, the criteria of effectiveness
or program success will center primarily on the changes that children
display. This simplistic distinction still overlooks additional important
considerations about when changes might realistically be expected and what
constitute "desirable changes" in children. Changes may range from
growing effectiveness in the use of such cognitive tools as reading and
mastery of quantitative concepts to growth in psychological and social
dimensions such as increased self-esteem, self-confidence, or social
sensitivity.

Although it is clear that the general purpose of the evaluation of
Follow Through is to assess the effectiveness of alternative approaches,
there is far from unanimity of opinion regarding the particular goals
that the approaches should seek. Thus, a fundamental issue in the
evaluation design has been from the outset how to accommodate to the
multiplicity of criteria by which program effectiveness is judged. The
decision to select a fairly broad set of measurable behaviors against which
to measure every program makes it possible to compare programs on that
set of behaviors. What is relinquished is the ability to determine, for
each sponsored model, if it accomplished its own aims.

Unrealistic expectations (e.g., measurable changes in "self- 
sufficiency" of poverty families attributable to short-term partici- 
pation in a school-based Follow Through program) , contradictory expecta- 
tions (e.g., immediate feedback versus summative pre-post Follow Through 
evaluation), and changing expectations (e.g., finding improved ways of 
educating disadvantaged children versus finding out if Follow Through, 
on the average, improves disadvantaged children's education) for the 
program made the selection of the most appropriate objectives for the
evaluation problematic.

Problems of Design 

Besides the problem of priorities posed to evaluation design by
disparate goals, there is the paucity of measurement technology in the
entire area of social action evaluation. The underdeveloped state of
measures of personal growth and development in children is already well
documented. Even less well explored are techniques for obtaining
information on the extent or quality of program implementation, institutional
responsiveness, or community change. Certainly techniques for measuring
educational product and social cost/benefit analyses are still totally
inadequate. Finally, the purely logistical demands of an evaluation of
a program the size of Follow Through are prodigious.

The quasi-experimental form the evaluation design has assumed results
from administrative decisions made in implementing the FT program. Some
of these practical constraints have been mentioned already. Each of
160 school districts in various regions of the country has its unique
group of community officials, parents, school principals, and teachers
who coordinate the services and work with the chosen sponsor in a unique
way. While continual modification is necessary in each setting to ensure
that the best possible practices are implemented, it makes description
of the experimental "treatment" to be evaluated very difficult.

School districts are recommended for participation by state
education officials and are awarded grants by the U.S. Office of Education on
the basis of political and administrative criteria unrelated to
evaluation. School communities naturally choose a model from among those offered
by sponsors for reasons of their own, without regard to experimental design.
These sponsored programs, which represent the only distinct part of the
Follow Through "treatment" (since nutrition, medical, and other service
components must be present in every program but are not otherwise specified
by type), differ from one another in an unsystematic manner.

Thus, it was clear by the time the evaluation began that the
possibility of randomization in the assignment of students, teachers,
classrooms, schools, or projects was superseded by administrative decisions.
Data collection procedures could follow planned schedules, but no
experimental control over the specification and scheduling of experimental
"treatments" was possible; that is, treatments were defined by persons
other than the experimenters, self-selection of treatments occurred, and
conditions of experimental independence were often violated. In addition,
intensive efforts made to involve those families "most in need" posed a
problem for the composition of adequate comparison groups.

Evaluation of the Planned Variations 

The innovative "planned variations" idea is the unique aspect of
the FT experiment and the key to understanding the plan for assessment.
The fundamental purpose of the Follow Through experiment is to find
educational strategies that might be used to improve the effectiveness
of the American primary schools for disadvantaged children. Thus we have
evaluations of alternative early education models that differ from one
another and from the alternative offered by the primary grades of the
present school systems.

Each sponsor has designed a program of education or intervention for
disadvantaged children or a way of changing the "significant others" in
their environments. Each sponsor has somewhat different immediate and
intermediate objectives and different theories about child development,
educational disadvantages, and education in general. Each also has
different methods of implementing the program that he believes will enhance
the school performance and presumably the "life chances" of poor or
disadvantaged children. The Follow Through evaluation provides the
opportunity for assessing these approaches only against a single set of criteria.

Evaluation of the national FT program then consists primarily of
determining which approaches are effective in achieving a specified set
of developmental or educational objectives for children and a variety of
changes in parent-community-school relations.

The specified set of objectives for children are the primary criteria
for the evaluation of effectiveness. But the evaluation also gives
consideration to elements in the children's environment that influence
development--family, neighborhood, and community setting as well as the
school. Although the Follow Through program was initiated with the
purpose of increasing the "life chances" of the children, it is only possible
to evaluate performance on objectives presumed to be intermediate to that
final goal. Objectives on which the sponsored educational alternatives
can be compared are, broadly speaking, those that are held for all
children at the end of the third grade. These are that children (1) be excited
about learning, (2) feel good about themselves and their own competence,
and (3) have mastered basic reading, language, and arithmetic skills that
will help them to proceed successfully in the rest of their school
experience.

The Follow Through evaluation lends itself primarily to policy
decisions that deal with selecting nationally robust models for improving
existing instructional programs for disadvantaged children. Federal
education officials will presumably determine the most appropriate
educational models to offer in their compensatory education programs.
Thus, administrators eagerly await information about which educational
models raise achievement of disadvantaged youngsters in academic skills
areas and which educational models create positive attitudes toward school
on the part of poor parents and their children. The results of the
evaluation will be pertinent to such decisions when data from a large enough
sample of children who have completed the educational programs associated
with the several sponsors become available.

* The programs of some sponsors are not directly concerned with
instruction of children, but attempt to change school and community
interactions.

Since it is possible that sponsored programs will not be equally
effective in all situations (ranging from inner-city ghetto to rural
Appalachia, from highly unionized to nonorganized teaching staffs), it
will be important to establish evidence of relative effectiveness of
programs on a project-by-project basis. An evaluation performed at this
level (which must await the development of a far greater and more
representative data base than is currently available) will provide a basis for
decisions at local levels about which programs appear to be most
appropriate to particular situations.

Overall FT/nFT Evaluation 

Follow Through as a service program was designed to continue
providing comprehensive services throughout the primary grades to children
who began receiving such a program in Head Start, the preprimary program.
It attempted to ensure continuity between preschool and elementary school
programs in terms of the full range of "life support" services children
required as well as the educational program. While the evaluation of
Follow Through is primarily focused on identifying effective educational 
strategies, it should make it possible to determine whether children in 
Follow Through have an advantage over those without a Follow Through pro- 
gram. The answer to this broad question would have a bearing on policy 
decisions, such as whether to increase or decrease support for compre- 
hensive compensatory education programs in general. Earlier reports 
(SRI, 1971, 1972a) dealt with these questions more directly, but the cur- 
rent Follow Through interim evaluation permits the question of overall 
impact to be addressed. 



Naturally, answers to policy questions such as "On the average is it
'worth it' to continue to invest in comprehensive compensatory programs
for disadvantaged children and their families?" are not resolved by
research evidence but depend on the valuational criteria held and the
frame of reference from which the facts are viewed.

One must remember that the only things common to Follow Through
treatments are that some (unspecified) set of nutritional, medical, and
other services supplemented some (at least nominally differentiated)
experimental educational programs. In addition, it should be pointed out
that when the "treatment" is defined this loosely it is difficult to
distinguish "treated" groups from comparison groups. Poor children who
are compared with Follow Through children are likely to have had a pri-
mary grade supplemented by services under another name (Title I or
Title III ESEA, hot lunch programs, etc.). Under these circumstances,
differential effects of Follow Through and Non-Follow Through "treatments"
would be extremely difficult to detect.

DESIGN AND SAMPLE

The Follow Through Evaluation Design

The basic research design for evaluating the impacts of FT planned 
variations is a series of replications of longitudinal comparisons of 
treatments versus comparisons. This design essentially corresponds to 
Campbell and Stanley's (1963) design 10, or "The (Pretest-Posttest) non-
equivalent control group design." There are, however, a number of features
that complicate the Follow Through evaluation design and that make
straightforward analyses and interpretation complex and difficult.

Within a given replication of the basic evaluation design, measures
are gathered on pupils as they enter school, Follow Through, and the
evaluation; and subsequent measures are gathered at the end of successive
experience years in school and FT. Measures are also gathered at selected
times on the families, teachers, classrooms, and communities with which
the pupils within a replication are associated. The data elements of
each such replication define a cohort in the evaluation. Those pupils 
in the evaluation who entered primary school in the Fall of 1969 consti- 
tute Cohort I, those who entered school in the Fall of 1970 constitute 
Cohort II, and so on. 

The Follow Through program is administered throughout the primary 
school grades, that is, from kindergarten through third grade. As such, 
Cohort I represents a 4-year experiment commencing in Fall, 1969, and 
terminating in Spring, 1973. Similarly, the Cohort II replication com- 
menced in Fall, 1970, with a new sample of participants and is scheduled 
to terminate in Spring, 1974. The design is complicated, however, by the 
fact that many of the participants within each cohort begin formal educa-
tion not at kindergarten but at the first grade. Thus, two subgroups --
or "grade streams" -- exist within each cohort: the Kindergarten subgroup
and the Entering First Grade (i.e., those participants in schools that
do not offer kindergarten) subgroup. Throughout this report, we will
refer to these two separate subgroups within cohorts by their respective
grade levels at entrance to the cohort -- Kindergarten (K) and Entering
First (EF).*
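
The cohort and grade-stream bookkeeping described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the three-year span used for the Entering First stream (first through third grade) is an assumption inferred from the kindergarten-through-third-grade description, not a figure stated in the report. The Kindergarten dates follow the Cohort I and Cohort II dates given in the text.

```python
def cohort_schedule(cohort: int, stream: str = "K") -> tuple[int, int]:
    """Return (fall entry year, spring exit year) for a cohort and grade stream."""
    if cohort < 1:
        raise ValueError("cohorts are numbered from 1")
    if stream not in ("K", "EF"):
        raise ValueError("stream must be 'K' or 'EF'")
    entry_fall = 1968 + cohort            # Cohort I entered in Fall, 1969
    years = 4 if stream == "K" else 3     # EF span is an assumption (grades 1-3)
    return entry_fall, entry_fall + years


print(cohort_schedule(1))          # Cohort I, Kindergarten stream
print(cohort_schedule(2))          # Cohort II, Kindergarten stream
print(cohort_schedule(1, "EF"))    # Cohort I, Entering First stream (assumed span)
```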

The basic longitudinal evaluation design is summarized in Table 1.
This representation displays the relationship of evaluation cohort to year 
of entrance into the evaluation and successive experience years within 
the evaluation. The shaded area represents that portion of the total de- 
sign on which this report is based. Although four cohorts are indicated 
in Table 1, it should be noted that the basic design allows for an in- 
definite number of successive additional cohorts. 

                              TABLE 1

             Basic Follow Through Evaluation Framework

           Enter         Year of Follow Through Experience
Cohort     Year       First     Second     Third     Fourth

I          1969
II         1970
III        1971
IV         1972


* Note that these definitions serve to distinguish two groups of partic-
ipants having different, yet "normal" (for the school district), en-
trance points into the experiment. These should not be confused with
subgroups of pupils which "migrate" into a program at some point after
these normal entrance points, e.g., pupils who transfer into or "enter"
a kindergarten cohort at some point after kindergarten. These latter
subgroups are not officially part of the evaluation design.

Each cohort is composed of a number of "projects," or sponsored
Follow Through programs. A project consists of one or more schools in 
which a particular program of services is being implemented. A given 
program of services includes as a main feature one of 22 sponsored models, 
or treatments, designed to "improve the life chances of poor children." 
Each project resides within a single school district, although occasion- 
ally more than one project resides within a single district. 

For each project participating in the evaluation, a non-project
comparison--or control group--is selected and recruited for participation
in the evaluation. Therefore, each cohort in the evaluation consists of
a collection of treatment and comparison groups. The collection of these
treatments comprises what is described as "planned variation," and this
planned variation dimension constitutes the treatment variable in the
overall evaluation design.

Attempts are made to obtain comparison groups that have salient
population characteristics reasonably similar to those of the project or
treatment groups and that are within the same or proximate district
boundaries. That is, to the extent possible, comparison schools are
selected because of similarity with FT school characteristics, such as
ethnic composition, general level of poverty of pupil families, and type
of neighborhood. The purpose for obtaining these matched comparison
groups is to provide a basis for validly assessing the FT program impacts
by contrasting measures obtained from comparison groups with those obtained
from FT groups. Thus, if matching is successful, the only relevant vari-
able on which the two groups differ is FT, and differences on measures
would be valid indicators of FT's effects. But since comparison group
schools participated on a voluntary basis and since these comparison
groups are constituted after a FT project is implemented and designated
for inclusion in the evaluation, such matching was accomplished with a
highly variable degree of success. However, the important point from a
design consideration is that neither the assignment of treatments to
projects nor the assignment of schools to treatment or comparison groups
is random.

Among the implications of this non-random assignment of treatments
to projects is the resultant imbalance of treatments across locations.
That is, since projects are neither systematically nor randomly assigned
to treatments, no national or regional representativeness is assured.
In actual fact, the imbalance of treatments across locations in the
samples included in this interim evaluation shows projects as essentially
nested within treatments. This nesting relationship is displayed in
Figure 1, which shows both the longitudinal and hierarchical properties
of the evaluation design. Hence, a given observation $X_{ijkl}$ represents
the value of $X$ associated with the $i$th planned variation approach
($A_1, A_2, \ldots, A_n$) implemented in the $j$th location
($L_1, L_2, \ldots, L_n$) during the $k$th experience year
($Y_1, Y_2, \ldots, Y_n$) of the $l$th cohort ($C_1, C_2, \ldots, C_n$).
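
The nesting of locations within approaches can be made concrete with a small check. The sketch below is illustrative only: it assumes each observation is indexed by a tuple (approach, location, experience year, cohort), and the function name and sample labels are hypothetical. A design is nested in the sense used here when every location appears under exactly one approach.

```python
from collections import defaultdict


def is_nested(observations):
    """Return True if every location occurs under exactly one approach."""
    approaches_by_location = defaultdict(set)
    for approach, location, _year, _cohort in observations:
        approaches_by_location[location].add(approach)
    return all(len(found) == 1 for found in approaches_by_location.values())


# Hypothetical index tuples: (approach i, location j, year k, cohort l).
nested = [("A1", "L1", 1, 1), ("A1", "L1", 2, 1), ("A2", "L2", 1, 1)]
crossed = nested + [("A2", "L1", 1, 2)]   # L1 now appears under two approaches

print(is_nested(nested), is_nested(crossed))
```

When the check fails, locations are at least partly crossed with approaches, which is the situation a randomized or systematically balanced assignment would aim to produce.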

The design logic for assessing the impact of these planned variations
is through pre-post comparisons of each treatment against its control.
If sufficient coverage of the population distributions of disadvantaged
children, their families, and communities is represented for treatments
within cohorts, then further inter-approach comparisons become possible.
That is, the overall, relative impact can be evaluated for those models
implemented in comparable sites and with other things essentially equal
(or equalized). Also, the longitudinal property of the design enables
assessment of changes over time, while cohort replications enable assess-
ment of changes in quality of implementation and associated effects.
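
One simple way to express a pre-post comparison of a treatment against its control is the difference between the two groups' mean gains. The sketch below is a generic illustration of that idea for a nonequivalent control group design; it is not the analysis used in this evaluation, and the function name and scores are hypothetical.

```python
from statistics import mean


def gain_difference(ft_pre, ft_post, nft_pre, nft_post):
    """Mean FT gain minus mean comparison-group gain for one project."""
    ft_gain = mean(ft_post) - mean(ft_pre)
    nft_gain = mean(nft_post) - mean(nft_pre)
    return ft_gain - nft_gain


# Illustrative scores only; these are not data from the evaluation.
effect = gain_difference(
    ft_pre=[10, 12, 14], ft_post=[20, 22, 24],
    nft_pre=[11, 13, 15], nft_post=[17, 19, 21],
)
print(effect)
```

Because the groups are not randomly assigned, a difference in gains of this kind can reflect initial group differences as well as program effects, which is why the matching of comparison groups matters.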

The Sample Subset for Assessment of Interim Impacts

The portion of the overall design that constitutes the basis for
this interim report extends from the 1969-70 through the 1970-71 school
year, or the first two rows in Table 1. As such, the first two years
of impact are being assessed for Cohort I (Fall, 1969, to Spring, 1971),
and the first year of impact is being assessed for Cohort II (Fall, 1970,
to Spring, 1971).

According to the pre-post design, premeasures are gathered on all
members of a cohort--treatment and comparison--at the time they enter the
evaluation. Subsequent postmeasures are gathered on selected subsets
of these cohorts at various later times.

Since intermediate "posttesting," or data gathering, is not conducted
on the total cohort sample, the interim assessment is restricted to those
components and participants that have been measured. The decisions as
to which and how many subsets would participate in interim measurement
were based on a variety of administrative and financial considerations
(SRI, 1972b) and effectively dictate the scope and generality of all
interim assessments. That is, the sampling and measurement design for
assessment of interim effects does not match the scope and magnitude of
the overall evaluation design, as schematized in Figure 1. The net con-
sequence of these reductions in interim data will be a corresponding
reduction in the interpretability and generality of interim findings.

[Figure 1 diagram omitted]

$L_1, \ldots, L_n$ = the locations in which a given approach or planned
                     variation is implemented (i.e., the project)

$c$ = the treatment and comparison schools within the project location

$Y$ = the measurement points corresponding to baseline and experience
      years

$C_1, \ldots, C_n$ = the cohorts of the total evaluation design

FIGURE 1   THE NESTED PROPERTIES OF THE LONGITUDINAL
           EVALUATION DESIGN

The Project Sampling Scheme



Selection of projects for inclusion in the evaluation sample required 
judgments and decisions by the planned variation sponsors, USOE/FT, and
SRI. In the initial phases of the evaluation (1968), the planned varia- 
tion sponsors were permitted and encouraged to designate projects for 
both certain inclusion and certain exclusion from the evaluation sample. 
The primary criterion for sponsors' judgments was the state of model imple- 
mentation as it could be estimated at that point. In general, sponsors 
requested inclusion of projects in which implementation appeared to them 
to be progressing well and requested exclusion of projects in which imple- 
mentation difficulties were being encountered. USOE/FT also influenced 
the composition of the evaluation sample by designating various projects
for certain inclusion or certain exclusion in addition to those so desig- 
nated by the sponsors. 

Finally, SRI selected additional projects from among the residual, 
following inclusion and exclusion specifications by sponsors and USOE/FT. 
The principal sampling criteria employed by SRI were: 

(1) To obtain at least five projects (if available) for each
planned variation.

(2) To maintain the 3:1 distribution of K to EF projects repre-
sented in the total FT program.

(3) To obtain representative geographic and urban/rural balance.

(4) To avoid impractical situations, such as locations where
comparisons were unobtainable.

In June, 1970, an additional sampling constraint was placed on
project selection; namely, any project would be excluded from the evalua-
tion sample during its first implementation year with a given sponsor.
This rule had retrospective consequences on data collected before its
formulation, as is noted below.

A complete description of the implementation of this sampling scheme
requires reference to 1968-69, during which many of the above criteria
were initially employed in selecting projects for participation in the
evaluation. In particular, 1968 was the first year of sponsor partici-
pation in Follow Through. From the total of 106 projects in Follow Through
at the beginning of the 1968-69 year, sponsors designated nine projects
for certain inclusion; these were projects in which the sponsors felt that
implementation was proceeding well and which should be included in the
evaluation. Sponsors also designated 17 projects for certain exclusion
in 1968-69 since difficulties of various kinds were being encountered
in implementing the Follow Through program or the specific model. In
addition to the 26 projects specified for either inclusion or exclusion
by the sponsors, USOE/FT designated 5 projects for certain inclusion
and 10 projects for certain exclusion. From the remaining 65 projects
(i.e., ones specified for neither inclusion nor exclusion), SRI selected
35 to satisfy the remaining sampling criteria of frequency and balance
across sponsors, regions, and grade streams. The total sample included
49 projects in 1968-69.

Because 1968-69 was subsequently designated as an "implementation
year," data collected on these 49 projects during that year were excluded
from evaluation. All but two of these projects were, however, part of
the subsequent Cohort I (Fall, 1969) sample.
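
The 1968-69 project counts given above can be checked with simple arithmetic. The short sketch below only restates the figures in the text; the variable names are illustrative.

```python
total_projects = 106
sponsor_in, sponsor_out = 9, 17      # designated by sponsors
usoe_in, usoe_out = 5, 10            # designated by USOE/FT
sri_selected = 35                    # chosen by SRI from the residual

# Projects left over after all inclusion/exclusion designations.
residual = total_projects - (sponsor_in + sponsor_out) - (usoe_in + usoe_out)

# The 1968-69 sample: every "certain inclusion" plus SRI's selections.
sample_1968 = sponsor_in + usoe_in + sri_selected

# "All but two" of these projects carried over into Cohort I.
carried_to_cohort_1 = sample_1968 - 2

print(residual, sample_1968, carried_to_cohort_1)
```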



Cohort I Sample (Fall, 1969)



The baseline sample for Cohort I consisted of 90 projects that were
selected in Fall, 1969. At that time, all entering pupils were tested.*
Of these 90 projects, 47 were sampled because they had been tested in 1968,
and 42 were selected on the basis of the other sample inclusion rules.
This resultant sample contained 61 K projects, 28 EF projects, and one
project classified as both K and EF. However, on the basis of the eligi-
bility policy formalized in June, 1970, 38 of these 90 projects became
ineligible for inclusion in the Cohort I evaluation sample. This post
hoc reduction impaired the balance of the sampling design implemented in
selecting the original 90 Cohort I projects.

This overall pattern is displayed in Table 2, which shows the dis-
tribution of project samples by sponsor for each of three measurement
periods--baseline (Fall, 1969), first year (Spring, 1970), and second year
(Spring, 1971). Entries for Spring, 1971, are further subcategorized to
show those Cohort I projects for which both first- and second-year mea-
surements were collected, and those for which only second-year measures
were obtained.



* It should be noted that the Fall, 1969, data collection included tests
of entering (K, EF) and of intermediate (1st, 2nd, 3rd and 4th) grade 
pupils. As such, the Cohort I baseline sample reflects only a subset 
of the overall activity. 



                                  TABLE 2

                DISTRIBUTION OF PROJECTS BY SPONSORS ACROSS
              MEASUREMENTS FOR THE COHORT I EVALUATION SAMPLE

                             NUMBER OF PROJECTS

                 FALL 1969              SPRING 1970           SPRING 1971
                        AFTER                   AFTER        1- &     2-YR
SPONSOR†    INITIAL   EXCLUSION     INITIAL   EXCLUSION      2-YR     ONLY

SS                8           7           5           5         5        1
FW                5           4           2           2         2        1
UA                7           6           3           3         3        1
BC                7           6           3           3         3        3
UG                3           3           0           0         0        2
UO                9           6           3           3         3        3
UK                9           4           2           2         2        2
HS                5           3           2           2         2        1
UF                5           3           3           2         2        1
ED                6           4           2           2         2        1
NY                2           2           1           1         1        1
SW                3           2           2           2         2        0
PI                5           1           1           1         1        0
ALL OTHERS*      16           1           2           1         0        1

TOTAL            90          52          31          29        28       18

* Not in the resultant collection of planned variations included in
this interim analysis.

† Refer to Executive Summary for full titles of sponsors.



The Cohort I sample can also be distributed in terms of grade streams
as follows:

                 FALL 1969              SPRING 1970           SPRING 1971
                        AFTER                   AFTER        1- &     2-YR
            INITIAL   EXCLUSION     INITIAL   EXCLUSION      2-YR     ONLY

K                61          35          19          17        19       14
K & EF            1           1           -           -         -        1
EF               28          16          12          12         9        3

TOTAL            90          52          31          29        28       18

Finally, the Cohort I testing pattern can be summarized in terms of the
number of projects involved in four distinct testing patterns (and one
residual or non-test group) as follows:

                                   MEASUREMENT POINTS

                          FALL 1969   SPRING 1970   SPRING 1971   TOTAL

TESTED AND REMAINED           X            X             X          28
ELIGIBLE                      X            -             X          18

TESTED BUT BECAME             X            X             -           3
INELIGIBLE OR WERE            X            -             -          41
PURPOSIVELY EXCLUDED

NOT TESTED                    -            -             -          72


Hence, 28 projects constitute the subset for which both first- and second- 
year measures were obtained, 18 projects the subset for which only second- 
year postmeasures were obtained, and 44 (41 + 3) the subsets for which 
only first-year or no postmeasures were obtained. 
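
The testing-pattern counts can be tallied directly. The sketch below simply encodes the four tested patterns from the summary above (the untested residual group is left out) and recomputes the subsets named in the text; the dictionary layout is an illustrative choice.

```python
# Key: (tested Fall 1969, tested Spring 1970, tested Spring 1971).
# Value: number of Cohort I projects showing that pattern.
patterns = {
    (True, True, True): 28,
    (True, False, True): 18,
    (True, True, False): 3,
    (True, False, False): 41,
}

baseline = sum(n for (fall, _, _), n in patterns.items() if fall)
second_year = sum(n for (_, _, spring_71), n in patterns.items() if spring_71)
first_year_only_or_none = baseline - second_year

print(baseline, second_year, first_year_only_or_none)
```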

Cohort II Sample (Fall, 1970) 

The June, 1970, USOE/FT eligibility rule was used to select the 
Cohort II sample, in addition to the other considerations. Eight projects 
that violated this eligibility rule were included for special purposes, 
such as obtaining measures in the fall on participants in summer programs
and obtaining information on participants who had previously taken part
in specific Head Start Planned Variation programs. This resultant Cohort
II sample is displayed in Table 3, which shows that a total of 107 proj-
ects were included in the baseline sample and that eight of these were
ineligible but were included for special purposes. Of these 107 projects, 
28 were included in the Spring, 1971, testing. This CII sample is also 
distributed in terms of the K and EF grade streams in Table 4. 
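
The drop from 107 baseline projects to 28 spring-tested projects (and, for Cohort I, from 90 baseline projects to 28 with both postmeasures) has a direct cost in statistical precision. As a rough sketch only, assume that projects are the unit of analysis, that they are independent, and that they share a common variance; under those assumptions, which are made here for illustration and are not taken from the report, the standard error of a mean grows in proportion to $1/\sqrt{n}$ as the number of projects $n$ shrinks.

```python
import math


def se_inflation(n_initial: int, n_retained: int) -> float:
    """Factor by which a standard error grows when n_initial units shrink to n_retained."""
    if not 0 < n_retained <= n_initial:
        raise ValueError("need 0 < n_retained <= n_initial")
    return math.sqrt(n_initial / n_retained)


# Cohort I: 90 baseline projects, 28 with first- and second-year measures.
print(round(se_inflation(90, 28), 2))
# Cohort II: 107 baseline projects, 28 tested in Spring, 1971.
print(round(se_inflation(107, 28), 2))
```

The factor understates the problem if the retained projects are not a representative subset of the original sample, since it addresses precision only and not bias.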

In summary, substantially fewer project samples were included in the
Cohort I and II interim evaluation than were initially selected. The
original plan was designed to include those projects considered exemplary
by sponsors, necessary or essential by USOE/FT, and representative in terms
of the evaluation design by SRI. One reason for this reduction was the
establishment of an eligibility rule in June, 1970, which specified that
projects must be affiliated with a sponsor or planned variation for at
least one year before being included in the evaluation sample. This
eligibility rule primarily affected the Cohort I sample. Further reduc-
tions occurred because our evaluation design specifies that only subsets
of cohorts be included in measurements of interim effects. The conse-
quences of these two reductions are seen both in the concomitant reduc-
tion in the scope of interim findings and consequent ability to general-
ize from them, and in the statistical precision with which any effects
can be detected. These consequences are discussed more fully in Annex A,
"Issues in the Analysis of the Data."

                                  TABLE 3

                SUMMARY OF THE COHORT II SAMPLE DISTRIBUTED
                     ACROSS SPONSORS AND MEASUREMENTS

                            FALL 1970                    SPRING 1971
              ELIGIBLE      TESTED BUT                    ELIGIBLE
SPONSOR†     AND TESTED     NOT ELIG.*      TOTAL        AND TESTED

SS                7              1             8              1
FW                9                            9              3
UA                8                            8              3
BC               10                           10              5
UG                3                            3              0
UO                9                            9              4
UK                9                            9              3
HS                5              3             8              2
UF                7                            7              3
ED                8                            8              2
NY                2                            2              2
SW                3                            3              0
PI                3              1             4              0
OTHERS*          16              3            19

TOTAL            99              8           107             28

* These projects (tested but not eligible for evaluation) were in-
cluded in the test sample for special purposes such as assessment
of summer effects, Head Start Planned Variation, and so on.

† Refer to Executive Summary for full titles of sponsors.

                                  TABLE 4

                  THE COHORT II SAMPLE DISTRIBUTED ACROSS
                        GRADE STREAMS AND MEASURES

                   FALL 1970                  SPRING 1971
                          AFTER                        AFTER
              INITIAL   EXCLUSION        INITIAL     EXCLUSION

K                77         71              20           20
K AND EF          5          4               1            1
EF               25         24               7            7

TOTAL           107         99              28           28
Section III

METHODOLOGY

Introduction

In this section we will describe the general and specific features
of the evaluation, instrumentation, and collection of data, construction 
of variables, and methods of data analysis. Where appropriate, we will 
distinguish between specific methodologies that were implemented in this 
1969-1971 interim evaluation and the general methodologies. 

This section is organized into the following four subsections:

(1) Instrumentation and data collection

(2) Procedures

(3) Definition and development of evaluation variables

(4) Analysis methodology.

Instrumentation and Data Collection

Follow Through is a complex, broad-scale educational experiment.
As such, a great variety of its qualitative and quantitative components
are of interest. It was clear early in the planning and preliminary
evaluation activities that for FT to be evaluated as a total program,
more must be measured than the participating child's academic progress.
Furthermore, since evaluation interest would be focused on identification
of "components" of "successful" programs, attention would need to be
given to evaluating the process as well as the outcome. Also, of course,
some minimum level of descriptive data would be essential.

Six basic sources of data, each of which corresponds with separate
instrumentation and data collection procedures, were employed in develop-
ing this evaluation evidence. They are the following:

• Classroom roster

• Follow Through test battery

• Parent interview

• Teacher and aide questionnaire

• Classroom observation procedure

• Project descriptor inventory.

Much of the basic evaluation effort was spent on developing and refining 
such instruments and procedures. Their purpose and contents are briefly 
described in the paragraphs to follow. The nature of the evaluative
questions and the focus on children's progress as the principal measure
of effectiveness determined the nature of the instruments used to collect
data. Clearly, this set of instruments does not begin to exhaust the
types or composition of instruments that could be employed in an evalua-
tion of all the aspects of Follow Through.


The Classroom Roster and Related Information Form 

The classroom roster provides a straightforward and relatively 
reliable source of several categories of information about the pupils, 
serves as a cross reference for certain data that are collected through 
other sources, and provides a basis for determining program census, 
migration, and attrition throughout the evaluation. Specifically, this 
instrument is a listing of the classroom pupils by name, age, sex, ethnic
group, language spoken at home, preschool experience, and amount of FT
services received, if any. Other items of information available from
each properly completed roster are classroom identifiers (room number,
principal, school, address, district), classroom staff (teachers, aides,
volunteers), and evaluation design information (cohort, grade stream,
grade level, and condition--FT versus NFT).

The roster form remained substantially unchanged from year to year
throughout the evaluation, although several minor changes were made to
facilitate its completion and to improve the clarity of data obtained.

The Follow Through Test Battery 

The principal source of evidence for program impact on pupils is
the Follow Through Test Battery. This battery is administered twice
each year--in the fall to obtain baseline information on children enter-
ing the program, and again in the spring to obtain progress and/or
outcome information on children progressing through or exiting from the
program. The same test instruments are administered to both FT and com-
parison classrooms within each grade level.

The contents of the FT battery have been changed from year to year
of administration, and, of course, differ across grade levels within
each year of administration. Nonetheless, the cognitive and non-
cognitive domains for which the instruments were selected are consistent
both within and across years. Changes in the battery primarily reflect
attempts at improved measurement, both in terms of reliability and
validity of data.

Cognitive Measures--The instruments that were included at one or
more levels of the test battery and that provided measures of perfor-
mance within the cognitive domain are described in the following paragraphs.

The Wide Range Achievement Test (WRAT) is a multi-level achievement
test designed for individual administration to younger pupils (those
attending kindergartens and first grades) and possible group administra-
tion to older pupils (those attending second and third grades). Essen-
tially, the WRAT provides a means of using a single instrument for pre-
post achievement assessment, although it is unlikely that certain items
appropriate for kindergarten would also be administered to third grade
pupils.

The WRAT was designed to provide measures of achievement in three
basic skill areas--reading, spelling, and arithmetic. Although the 1965
version of the test is standardized and normed, it contains several
relatively unconventional features. First, the test is interactive;
that is, the set of questions the student is asked depends on how he
performs on certain items. Second, the test is of variable length for
each pupil; testing is continued until particular error runs occur.
Third, normative conversions are supplied for separate subtests but not
for a total score. Finally, the appropriateness of these norms for the
Follow Through Evaluation Sample was questioned since they were based
on a norm sample of less than 2000 pupils for the age range participating
(five to eight years) and, according to the technical manual, "No attempt
was made to obtain a representative national sampling" (Jastak and
Jastak, 1965, p. 9).
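
The variable-length feature, in which testing continues until a run of consecutive errors occurs, can be sketched as a simple stopping rule. The code below is an illustration of that general idea and not the published WRAT procedure: the function name is hypothetical, and the run length is left as a parameter because the criterion value is not given here.

```python
def administer(responses, stop_run):
    """Score items in order, stopping after `stop_run` consecutive errors.

    responses: iterable of booleans (True = correct), in presentation order.
    Returns (raw_score, items_administered).
    """
    if stop_run < 1:
        raise ValueError("stop_run must be at least 1")
    score = administered = run = 0
    for correct in responses:
        administered += 1
        if correct:
            score += 1
            run = 0          # a correct answer resets the error run
        else:
            run += 1
            if run == stop_run:
                break        # criterion run reached; testing ends
    return score, administered


# Hypothetical pupil: the third consecutive error ends the test early,
# so the final (correct) response is never administered.
print(administer([True, True, False, True, False, False, False, True], stop_run=3))
```

A rule of this kind means that two pupils can receive different numbers of items, which is one reason uniform administration procedures are harder to guarantee.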

Because of the need to establish and follow uniform, rigorous, and
replicable testing procedures and because of concern over certain mea-
surement and evaluation issues (such as the need for 1? successive errors
as a criterion run, the adequacy and appropriateness of the norm conver-
sions, and administrative problems of implementing the quasi-branch
methodology), a modification of the WRAT was developed by SRI and used
in this evaluation. This modification, based on age-equivalent standard-
ization data, created four overlapping versions of the test, or one for
each grade level. Each such grade level of the WRAT was generated by
including all items up to a point corresponding to two standard devia-
tions above the standardization age-equivalent sample. Also, some
modifications in the sequence and lexicography of items were made to
improve administrability. All these changes were made in consultation
with the authors and publisher. Furthermore, minor subsequent revisions
in the instrument were made based on item analysis of data following
wide-scale administration. These modifications essentially adjusted the
limits within the grade level, i.e., adjusted the overlap.
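
The construction of overlapping grade-level forms can be pictured with a small sketch. It assumes, purely for illustration, that each item carries an age-equivalent level and that a form keeps every item at or below the group mean plus two standard deviations; the function name, item levels, means, and standard deviations are all hypothetical, and any lower limit on a form is ignored.

```python
def build_form(item_levels, group_mean, group_sd):
    """Return indices of items at or below group_mean + 2 * group_sd."""
    cutoff = group_mean + 2 * group_sd
    return [i for i, level in enumerate(item_levels) if level <= cutoff]


# Hypothetical age-equivalent levels (years) for ten items, easiest first.
levels = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5]

kindergarten = build_form(levels, group_mean=5.5, group_sd=0.5)
first_grade = build_form(levels, group_mean=6.5, group_sd=0.5)

print(kindergarten)
print(first_grade)
```

In this sketch every kindergarten item also appears in the first-grade form, which is the kind of overlap that later item analysis could widen or narrow.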

Information on specific items in the WRAT test in terms of average
item difficulty and variability of responses to each item within each
grade level is summarized in Annex B of this report. Annex B also in-
dicates the item overlap for the separate grade levels.
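
For items scored right or wrong, item difficulty is conventionally the proportion of pupils answering correctly, $p$, and the response variance is $p(1 - p)$. The sketch below computes both under that conventional definition, which is assumed here and may differ from the exact statistics reported in Annex B; the function name and scores are hypothetical.

```python
def item_statistics(responses):
    """Per-item difficulty (proportion correct) and variance p * (1 - p).

    responses: list of pupils, each a list of 0/1 item scores of equal length.
    """
    n_pupils = len(responses)
    if n_pupils == 0:
        raise ValueError("no responses")
    n_items = len(responses[0])
    stats = []
    for j in range(n_items):
        p = sum(pupil[j] for pupil in responses) / n_pupils
        stats.append((p, p * (1 - p)))
    return stats


# Hypothetical scores for four pupils on three items.
scores = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [1, 0, 1]]
print(item_statistics(scores))
```

An item that every pupil passes (or fails) has zero variance and so contributes nothing to distinguishing pupils or groups, which is the practical reason both statistics are examined together.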

The Preschool Inventory was administered to all pupils at entrance
and at the end of the first year of the program. This instrument was
originally developed by Bettye Caldwell for ETS use in the study of
Early Educational Programs. The instrument was designed to survey the
level of conceptual development and general information and rudimentary
basic skills present in each child. The test is individually administered
and has not been nationally normed. The items are, in general, appro-
priate for a preschool (e.g., Head Start) population. Thus, test scores
approach an asymptote beyond kindergarten (i.e., there is a ceiling effect
for first grade). Since the instrument measures general basic skill per-
formance, item sampling procedures were implemented. This resulted in
a reduction of test length from 64 items to a final set of 29. Statistics
on these 29 items are summarized in Annex B of this report.

Another source of information regarding incoming skills or "entering
behaviors" of the evaluation participants was an adaptation of the
Lee-Clark Reading Readiness test. This test primarily assesses the child's
skill at letter and word discrimination, matching, and oddity discrimina-
tion. This was administered only during the entering year and in a group
mode. Item statistics are presented in Annex B of this report.

A third instrument administered only to pupils in their first year
of the program is based on items developed by Martin Deutsch and asso-
ciates in the N.Y.U. Early Childhood Inventories project. These items
require number and letter discriminations and recognitions and can be
considered pre-reading and pre-math, much like the PSI items. The test
contains 31 items and was group administered. Item statistics are sum-
marized in Annex B of this report.

A total of six subtests contained in the 1965 version of the
Metropolitan Readiness Test were administered to pupils in their second
year of the program (i.e., non-entering first graders and second graders
in 1970-71). These subtests include measures of word meaning, listen-
ing, matching, alphabet identification, numeration, and symbol copying.
This test was group administered and consists of 38 work items and ap-
propriate practice items. Item statistics are summarized in Annex B of
this report.

In addition to the Metropolitan Readiness items, selected Stanford
Achievement Test items (1964 revision) and Metropolitan Achievement Test
items (1958 revision) were administered to pupils in their second year of
the program. This SAT-MAT subtest consists of 20 word reading items
and 20 arithmetic computation items. The test was group administered
and allowed eight minutes for word reading and seven minutes for arith-
metic. Item statistics are summarized in Annex B of this report. Also
included in Annex B are summary statistics describing the measurement
properties of those instruments when aggregated into specific variables.

Sponsor Contributed Items--To guard against the FT Test Battery's
failing to cover items relevant to sponsors' objectives, attempts were
made to solicit sponsor-contributed test items and to incorporate them
into the upper grade levels of the battery (first grade and beyond).
These items are labeled sponsor items and vary from imbedded figures
tests to measures of word reading, numeration, concept identification,
alphabet skills, language/reading skills, set operations, straight
arithmetic abilities, verbal analogies, and so on. These tests were
individually administered. Item difficulty and response variability
are summarized in Annex B of this report.

Noncognitive Measures--A substantial interest in the collection and
analysis of noncognitive--or affective--indices of program impact was
expressed early in the evaluation planning. However, several difficulties
soon became apparent. First, unlike cognitive measures (i.e.,
achievement, intelligence, aptitude, readiness, etc.), noncognitive instruments
have not emerged in widely accepted or standardized forms. Whether this
is due to inherent difficulties in developing such instruments or to the
lack of prior focus on the domain is irrelevant; the fact remains that
no instrument comparable to the WRAT exists for noncognitive assessment.

Rather than abandon the domain as currently unmeasurable, limited
noncognitive instrumentation was included in the battery, and a research
program for the identification and evaluation of alternative noncognitive
measures was instituted (SRI, 1970, 1972c, 1972d). The results of all such
instrument evaluation studies to date are generally discouraging,
suggesting that although the limited instrumentation initially incorporated in
the battery is far from sufficient as a valid and reliable measure of
affect, no better methods or instruments short of a clinical interview
are currently available.

The limited noncognitive instrumentation included in the battery is
called the "Faces" Attitude Inventory. It consists of a series of
self-report questions, in which the pupil is asked to indicate how he feels
about himself, others, school, and learning, and how others feel toward
him, i.e., his peer group status, and so on. To provide each pupil, even
an entering kindergartner, with a relatively unequivocal means of
indicating his feelings to each such item, the tester asks each student to mark
a face (smile, so-so, frown) that corresponds to this feeling. This test
was group administered* to pupils in the 1969-1971 assessment period.
The same instrument was administered to all grade and experience year
levels.

The contents of the overall Follow Through Test Battery as
administered over the period of Fall, 1969, through Spring, 1971, are summarized
in Table 5. This table lists the test contents in terms of item sources
for each level relevant to this interim evaluation. Also displayed in
Table 5 are the maximum scores obtainable for each item source. Changes
in these maximum scores reflect longitudinal (year to year) as well as
grade level differences in the overall battery. It should be mentioned
that during the Fall 1969-Spring 1970 test interval pupils at grade
levels besides kindergarten and entering first were being tested. The
scope of the testing effort extended to first, second, third, and fourth
grade pupils in groups later excluded from the evaluation sample. A
description of the entire data collection effort is beyond the scope of
this report and is mentioned here only to provide a context for the
Cohort I testing.



* Based on subsequently developed research evidence, this and similar
noncognitive instruments were individually administered to all
kindergarten and first grade pupils from Fall, 1971, onward.

TABLE 5

SUMMARY OF FT TEST BATTERY--FALL, 1969-SPRING, 1971

FALL, 1969-SPRING, 1970

DOMAIN        COHORT I                                  COHORT II
              K                   1ST (EF)
              WRAT (84)           WRAT (84)             (NOT YET IN PROGRAM)
              PSI (32)            PSI (33)
              LEE CLARK (16)      LEE CLARK (16)
              NYU (22)            NYU (22)
NONCOGNITIVE  FACES (21)          FACES (21)
MEASURE

FALL, 1970-SPRING, 1971

DOMAIN        COHORT I                                    COHORT II
              1ST                   2ND (EF)              K               1ST (EF)
              WRAT (112)            WRAT (149)            WRAT (84)       WRAT (112)
              SPONSOR (30)          SAT/MAT (40)          PSI (41)        PSI (34)
              METRO READINESS (...) SPONSOR (69)          LEE CLARK (14)  LEE CLARK (16)
                                    METRO READINESS (26)  NYU (21)        NYU (20)
                                                                          SPONSOR (14)
                                                                          METRO READINESS (32)
NONCOGNITIVE  FACES (21)            FACES (21)            FACES (21)      FACES (21)
MEASURE

Numbers in parentheses reflect maximum possible score.



The Parent Interview

The primary source of evidence for assessment of program impact on
parent, home, and community factors is the parent interview. This
interview is conducted on an in-person basis and is administered by the
National Opinion Research Center (NORC) under subcontract to SRI. These
interviews are conducted in the early spring of each year and concentrate
on sampling among parents of children entering the evaluation (both
Follow Through and Comparison). In Spring, 1970, nearly 9,000 Cohort I
interviews were conducted,* and in Spring, 1971, nearly 15,000 parents
were interviewed, a subset of whom had children in Cohort II.

Within the overall Follow Through evaluation the interviews with
parents of Follow Through children and non-Follow Through children serve
four main purposes:

(1) Provide information from which to estimate the initial
comparability of Follow Through and non-Follow Through
children and families according to socioeconomic, ethnic,
and other demographic characteristics, and to adjust
statistically for noncomparability since random assignment was
not possible.

(2) Provide information for sorting respondents into subgroups
so that possible interactions between treatment and parent
characteristics can be examined.

(3) Develop indicators of parental beliefs, expectations, and
practices that characterize family life styles which may
be influenced by or may mediate the effects of Follow Through
program participation. Some Follow Through programs are
specifically designed to bring about a considerable degree of
parent involvement and parent education while others
emphasize these goals only minimally.

(4) Ascertain the parent's knowledge of, participation in, and
satisfaction with school programs in general and Follow
Through in particular.

SRI, OE, and consultants selected a set of questions that, in their
judgment and with the constraints operating at the time, best served these
purposes. These items made up the first Parent Interview (Spring, 1970).
The first Parent Interview items provided data in the following ten areas:

• Demographic data

• Interest and knowledge about FT

• Participation in making policy with respect to educational
programs

• Contact with the school and teacher

• Feelings about ability to control one's life

• Support and guidance of child with respect to educational
programs

• Extent of educationally relevant stimulation in the home
environment

• Number of types of programs available to child in community

• Aspirations.

* During Spring, 1970, over 14,000 interviews were conducted. Decisions
subsequent to data collection delimited the sample of interest.

In the second Parent Interview, used as a source of data for this 
analysis (Spring, 1971), the following 11 content areas were used: 

• Demographic data 

• Awareness of what is going on in child's classroom 

• Participation in policy making with respect to educa- 
tional programs 

• Parental involvement with educational components of 
planned variation 

• Feelings of being able to control one's life 

• Support and guidance of child with respect to educational
items, and extent of relevant stimulation in the home
environment

• Satisfaction of parent with school 

• Satisfaction of child with school 

• Expectations/anticipations/goals of the parents 

• Feelings of efficacy in relation to the school 

• Social life style.

Teacher and Aide Questionnaire 

The Teacher and Aide Questionnaires were developed to complement
and support data gathered through other measurement instruments of the
Follow Through evaluation and to learn whether teachers in the Follow
Through program were changing. First, the questionnaires were designed
to provide profiles of the teachers in terms of demographic
characteristics, the training and support they received, and the goals and
attitudes they held. Second, the questionnaires could provide information
about the different effects of different Follow Through sponsors' programs.
As such, surveys were made of teacher practices and attitudes during
the Springs of both 1970 and 1971 using a teacher questionnaire that
included questions in the following areas:

• Demographic information and background 

• Classroom practices 

• Availability and use of equipment and materials 

• Educational goals for children 

• Information and attitudes about home visits and parent 
participation in the classroom 

• Knowledge about Follow Through, manner of involvement with 
the program , and opinions about its effectiveness 

• General assessment of pupil progress. 

Some instrument revision occurred between the 1970 and 1971
administrations, and only data for the 1971 administration are included in
this interim analysis. The nature of the 1971 revision reflected an
increased interest in teacher characteristics and practices, and in
classroom composition.

For example, portions of the instrument dealing with teacher
characteristics included items designed to assess:

• Educational background 

• Teaching experience 

• Educational goals 

• Motivation

• Training prior to current school year 

• Training during school year 

• Suggestions for improving training 

• Help provided by Sponsor 

• What teachers know about Follow Through

• Discussion of Follow Through with others. 

Classroom characteristics were assessed through items dealing with:



• Length of class day 

• Number of children at present and all year 

• Languages spoken by the children at home and in the classroom 

• Teaching approaches and techniques

• Field trips taken. 

Specific questions about parent involvement included:

• Parent participation in classroom

• Number of teacher visits to parent and reasons for them 

• Number of parent visits to school and reasons for them 

• Teachers' feelings about the importance of getting together
with parents outside school.

Finally, questions about the progress of children included teachers'
estimates of the progress of children in their classes on various
cognitive and noncognitive characteristics. However, teachers were not
questioned about individual children.

The Classroom Observation Instrument 

The SRI Classroom Observation Instrument (COI) is an elaborate
event recording procedure by means of which a trained observer in a
classroom records interactions among teachers, aides, and children, and
also records setting, kinds of activities, and groupings within the
classroom. The instrument was developed to be appropriately sensitive
to a broad range of activities characterizing the programs of a subset
of Follow Through sponsors,* while retaining adequate reliability.



* Sponsors included in the 1970-71 sample for classroom observations were
Bank Street College, University of Kansas, University of Oregon,
University of Florida, Educational Development Center, New York
University, Far West Laboratory for Educational Research and Development,
High/Scope Educational Research Foundation, and the University of
Arizona.

The SRI Classroom Observation Instrument has three major parts--an
Observation Summary Form (OSF) for describing the physical environment,
a Classroom Check List (CCL), and a Five-Minute Observation (FMO) form.

The CCL, sometimes referred to as the "snapshot," attempts to record
relatively static pictures (four an hour) of the distribution of adults
and children among activities in the classroom. Essentially, the CCL
assesses (1) activities occurring, (2) materials used in activities,
(3) grouping patterns, (4) teacher and aide responsibilities, and
(5) children working independently.

The FMO record of interactions is completed four times an hour (i.e.,
following each CCL). It requires a symbol to be marked for (1) who does
the action, (2) to whom it is done, (3) what is done, and (4) how it is
done. A complete unit of interaction is described when the coded
categories are strung into a sentence structure format or frame. This frame
is a sequence of "parts of speech" or subject, object, verb, and adverb.
The "Who" and "To Whom" codes make it possible to designate the person
or group of persons initiating or receiving an action.

The 12 "What" codes refer to categories (e.g., question, response,
instruct) that survived several iterations of instrument review with
consultants and sponsors' representatives to ensure that the instrument
captured classroom interactions considered educationally significant.

The first four items in the "How" code refer to the affective
aspects of an interaction between people or with materials. The next six
items refer generally to strategies the teacher may use to control
behavior in her classroom. The last two "How" items--"concrete objects"
and "symbolic objects"--were added to capture an important distinction
among instructional strategies made by certain sponsors who believe
children must learn first from experiences with concrete objects before
proceeding to experiences with symbols (ideas).

The FMO frame, then, is to record an interaction as a "sentence"
including "Who, To Whom, What, and How." Figure 2 shows the numeric and
alphabetic symbols, or codes, and their brief definitions. Operational
definitions and examples are contained in the complete Training Manual.
Figure 2 also shows two sample frames recording teacher-pupil
interactions.
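
As a rough illustration of the frame structure described above, the following Python sketch stores one interaction as a Who / To Whom / What / How record and renders it as a "sentence." The code tables are a partial transcription of the codes listed in Figure 2. The `Frame` class, its field names, and the particular codes assigned to the sample exchange are assumptions made for illustration; they are not taken from the SRI Training Manual.

```python
from dataclasses import dataclass
from typing import Tuple

# Partial code tables transcribed from Figure 2 (not the full instrument).
WHO_CODES = {
    "T": "Teacher", "A": "Assistant/Aide", "V": "Volunteer", "C": "Child",
    "D": "Different Child", "2": "Two Children", "S": "Small Group",
    "L": "Large Group", "E": "Everyone", "M": "Materials",
}
WHAT_CODES = {
    1: "Direct request", 2: "Choice request", 3: "Respond",
    4: "Teach, Inform", 5: "Comment, play", 6: "Praise, Acknowledge",
    7: "Help", 8: "Cooperate", 9: "Corrective feedback",
    10: "No response, Ignore", 11: "Refuse, Reject", 12: "Observe",
}
HOW_CODES = {"H": "Happy", "S": "Sad", "N": "Negative", "Q": "Question", "F": "Firm"}


@dataclass(frozen=True)
class Frame:
    """One coded interaction: subject, object, verb, and optional adverbs."""
    who: str
    to_whom: str
    what: int
    how: Tuple[str, ...] = ()

    def __post_init__(self):
        # Reject any symbol that is not in the (partial) code tables.
        if self.who not in WHO_CODES or self.to_whom not in WHO_CODES:
            raise ValueError("unknown Who/To Whom code")
        if self.what not in WHAT_CODES:
            raise ValueError("unknown What code")
        for code in self.how:
            if code not in HOW_CODES:
                raise ValueError("unknown How code")

    def as_sentence(self) -> str:
        """Render the frame in subject, object, verb, adverb order."""
        text = f"{WHO_CODES[self.who]} -> {WHO_CODES[self.to_whom]}: {WHAT_CODES[self.what]}"
        if self.how:
            text += " (" + ", ".join(HOW_CODES[c] for c in self.how) + ")"
        return text


# Hypothetical coding of a teacher's question and a child's smiling answer.
question = Frame(who="T", to_whom="C", what=1, how=("Q",))
answer = Frame(who="C", to_whom="T", what=3, how=("H",))
print(question.as_sentence())
print(answer.as_sentence())
```

Keeping the four parts as separate fields mirrors the mark-sense layout, in which each part of the frame is marked independently, and lets an invalid symbol be caught at the moment a frame is recorded.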

Coding a frame requires the observer to make a dot with a
felt-tipped marker pen on the appropriate symbol. Special mark-sense forms
were developed and adapted to the procedure to facilitate accuracy and
reliability of the observation/recording process.




CODES USED ON CLASSROOM
OBSERVATION INSTRUMENT

R - Repeat
C - Cancel

Who and To Whom                 How

T - Teacher                     H   - Happy
A - Assistant/Aide              S   - Sad
V - Volunteer                   N   - Negative
C - Child                       A   - Angry
D - Different Child             G   - Guide to alternative
2 - Two Children                R   - Reason
S - Small Group                 C   - Control by praising
L - Large Group                 Q   - Question
E - Everyone                    F   - Firm
M - Materials                   D   - Demean
0 - Confusion                   Th  - Threaten
                                P   - Punish
What                            T   - Touch
                                O   - Object
 1 - Direct request                   Symbol
 2 - Choice request             V   - Verbal
 3 - Respond                    N/V - Non-verbal
 4 - Teach, Inform
 5 - Comment, play
 6 - Praise, Acknowledge
 7 - Help
 8 - Cooperate
 9 - Corrective feedback
10 - No response, Ignore, "I don't know"
11 - Refuse, Reject
12 - Observe
 0 - Confusion

EXAMPLE OF CODED INTERACTION FROM
CLASSROOM OBSERVATION INSTRUMENT

Teacher: Mary, if you had 2 pennies and your mother
         gave you 2 more, how many would you have?

         Who     To Whom     What     How

Child:   I would have 4 pennies. (Smiling)

         Who     To Whom     What     How

FIGURE 2



Project Descriptors

A variety of categories of data descriptive of communities, school
systems, and projects were collected and assembled for inclusion in the
interim evaluation.

The main sources of descriptive data for the projects included in 
this interim analysis were the Census, Consolidated Program Information 
Report (CPIR) and Elementary and Secondary General Information Survey 
(ELSEGIS) systems, NCES documents, individual Follow Through projects' 
annual applications, progress and interim reports, and data already 
available in the SRI Follow Through data bank. 

Data from the 1970 Census were available as published documents and
on magnetic data tapes. However, because of difficulties with system
compatibility and software, only published census reports were utilized.
These included the following final Census reports:

PC(1)-A, Number of Inhabitants

HC(1)-A, General Housing Characteristics

PC(1)-B, General Population Characteristics

Similarly, anticipated software and system compatibility problems
precluded use of CPIR and ELSEGIS data tapes. As such, validated
listings of all ELSEGIS and CPIR data obtained for school systems and
individual schools in the interim sample were used. These validated
listings of data are essentially raw data tables in a format similar
to that of the original instrument, but which have been subjected to
editing and verification.

CPIR collects and enumerates statistical information aggregated at
the school district level for all Federal title expenditures on the
following (quoted from Federal State Task Force . . . , 20):

(1) Number of children and number of schools in the district
by pupil population groups, grade levels, and services and
activities provided;

(2) Number of staff members by activity and pupil populations
served, number of staff members participating in Federal
programs, and Federal dollars expended on in-service
training by source of funds;

(3) Dollars expended, by source of funds, pupil population
groups, services and activities provided, and dollars by
age/grade level;

(4) Supplemental information appropriate to specific programs,
such as the ESEA Title III.

On the other hand, the ELSEGIS system is primarily designed to
collect data on the individual school system and analyze these data by
enrollment, metropolitan status, and geographic region. ELSEGIS, Parts
A, B, and C, are surveys conducted biennially. Collectively, they cover
staff, finances, public school organization, and pupils. Parts A and B
give data aggregated at the school district level; for Part C, the unit
is the individual school.

Procedures 

The data gathered for the Evaluation of Follow Through were obtained
through two general procedures: (1) direct assessment in the schools and
communities as with the FT test battery, parent interview, teacher and
aide survey, and community studies and (2) the use of nonreactive
measures and secondary data sources, such as the FT classroom roster,
classroom observation, and project descriptor data sources (e.g., Census,
CPIR, ELSEGIS, etc.). Since Follow Through is a very large and complex
program, correspondingly elaborate, yet systematic and detailed, data
collection procedures were developed for all direct assessments. These
included the establishment of:

• Instrumentation and materials development, logistics, and
receipt control procedures

• Data collection training, scheduling, supervision, and
management procedures

• Instrument administration, scoring, and processing procedures

• Quality control, error resolution, and data storage and
retrieval systems.

Specific data collection procedures varied according to requirements
of the separate instruments. Pupil data (roster and test battery) were
obtained through the SRI field operations procedure. Parent interview
data were obtained through in-person interview by NORC. Teacher and aide
questionnaire data were obtained by mail-survey methods. Classroom
process data were obtained through the classroom observation procedure.
Project descriptor data were acquired from secondary sources. Community
studies data were obtained on a case study basis. Each of these
procedures is described in the following paragraphs. Where appropriate, they
are discussed in terms of development, training, administration, and
processing components.

Rostering Procedure

Roster data were a basic source of relevant descriptive information
on pupils and classrooms and thus were essential for the interpretation
of test scores and for the organization and management of all subsequent
longitudinal data. The initial procedure for collecting roster data
involved reliance on site personnel and classroom teachers. Specifically,
rosters were distributed at the end of the school year to classroom and
school personnel via FT Directors. These local personnel were to complete
the rosters and return them to SRI, also via the FT Directors. However,
over the course of the evaluation, procedures employed to gather these
roster data were revised because feedback indicated that problems were
encountered in this initial procedure. This revision involved the
following two major changes, which were implemented in the 1970-71 rostering
procedure:

(1) Rostering of all classrooms scheduled for any testing (either
Fall or Spring) was completed twice during the school year:
at the time of fall testing and at the time of spring testing.

(2) Responsibility for accuracy and completeness of rosters was
assumed by SRI field staff (described below) who worked with
local school personnel and teachers in obtaining and validating
roster data.

These revisions greatly improved the quality and completeness of roster
information, which was, of course, essential, since rosters provide the
major basis for linking up longitudinal data; for tracking control schools,
classes, and pupils; for organizing all pupil, classroom, and teacher data
into appropriate subgroups (e.g., FT/NFT); for defining the parent
interview sample; and for organizing the data storage and retrieval system.
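
The linking role of the roster can be pictured with a small Python sketch. It is only a schematic: the pupil identifiers, the `group` field, and the record layout are hypothetical and do not describe the actual SRI data storage and retrieval system.

```python
from collections import defaultdict


def link_scores_to_roster(roster, scores):
    """Attach test scores to rostered pupils, keyed by testing period.

    roster: dict mapping pupil_id -> {"group": "FT" or "NFT"}
    scores: iterable of (pupil_id, period, score) tuples
    Returns (linked, unmatched), where unmatched lists score records
    whose pupil does not appear on any roster.
    """
    linked = defaultdict(dict)
    unmatched = []
    for pupil_id, period, score in scores:
        if pupil_id in roster:
            linked[pupil_id][period] = score
        else:
            unmatched.append((pupil_id, period))
    return dict(linked), unmatched


def split_by_group(roster, linked):
    """Organize linked longitudinal records into FT and NFT subgroups."""
    groups = {"FT": {}, "NFT": {}}
    for pupil_id, record in linked.items():
        groups[roster[pupil_id]["group"]][pupil_id] = record
    return groups
```

A score record with no roster entry cannot be assigned to FT or NFT at all, which is one way to see why roster completeness mattered for every later analysis.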

FT Testing Procedure 

As described in the instrumentation section, the FT battery consisted
of several grade-specific and several grade-overlapping test instruments.
Moreover, some changes in the grade-specific contents of the battery were
made from Fall, 1969, to Spring, 1971 (the duration of this interim
evaluation). These modifications generally reflect subtle changes in
evaluation focus over the interim period as well as instrument
refinement based on item analyses of preceding administrations.

Parallel to the instrument refinement, substantial development and
modification of training and administration procedures occurred.
Specifically, the training and test administration procedures for Fall, 1969,
data collection employed 36 SRI/FT "regional representatives," whose
responsibility it was to recruit, train, and supervise test
administrators and coordinate the collection of data from over 35,000 pupils in
90 projects (the Fall, 1969, Cohort I sample). These regional
representatives generally were university faculty (most from schools of
education) and were recruited as consultants to SRI. These representatives
attended a single, central training session conducted by SRI project
staff. The training included testing procedures, test receipt control,
shipping and documentation procedures, and various scheduling and
administrative details. Each representative was responsible for hiring his
own testing staff and for conducting his own local training session in
accordance with guidelines established by SRI (SRI, 1969). Each regional
representative was assigned between two and four projects and supervised
an average of 10 testers.

Several features distinguishing the Fall, 1969, test effort were
the use of supervising, assistant, and aide testers at the classroom
level; attempts at the use of systematic scheduling for test
administration (with pupils assigned to schedules at random); and the use of
multilevel management procedures to establish uniform supervision and control
over the brief but wide-scale test program. All pupil tests were to be
administered within a two-week period, hopefully between the sixth and
eighth week following the commencement of school and again between three
and six weeks prior to completion of the school year. Within these
testing intervals, attempts were made to test FT and NFT classrooms
simultaneously, alternating testers between FT and NFT classrooms at random.
Regional representatives served as test coordinators, field supervisors,
staff trainers, and SRI-project liaisons.

Experience gained in the preliminary data collection activities in
1968-69 indicated that these procedures should be adequate, especially since
the time available in which to develop and implement any procedure
following specification of the battery and approval of the sample was extremely
short. Recall that Cohort I K and EF samples were only a subset of the
Fall, 1969, sample. Data were also collected on continuing first, second,
third, and fourth grade pupils at these sites. Also, because of the scope
and magnitude of the Fall, 1969, effort, and because of delays in
negotiating the content of the test battery, completion of Cohort I baseline
testing often occurred as late as December, 1969, nearly half-way into the
school year. This unavoidable delay in the Cohort I baseline testing has
unfortunate consequences for the validity and utility of resultant data,
particularly for those programs designed to produce large early impacts.
For example, if a program had a large effect on improving performance of
FT pupils during the interval between commencement of school and baseline
testing, these FT pupils will invalidly appear as "more able" at baseline.
If this difference is considered or adjusted for in an analysis of FT/NFT
outcomes, the net effect will be that of "adjusting out" the impact of the
program--making the statistical test biased against finding an FT program
effect.
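
The direction of this bias can be seen in a small numerical sketch. All score values below are invented for illustration, and a simple difference of gain scores stands in for whatever statistical adjustment an analysis might actually apply; neither is drawn from the evaluation data.

```python
def gain_difference(ft_baseline, ft_spring, nft_baseline, nft_spring):
    """FT gain minus NFT gain: a simple stand-in for a baseline adjustment."""
    return (ft_spring - ft_baseline) - (nft_spring - nft_baseline)


# Invented score points, identical for both groups unless noted.
START = 50         # mean score at the commencement of school
GROWTH = 10        # growth either group would show without the program
EARLY_EFFECT = 4   # FT program effect realized before a late baseline test
LATER_EFFECT = 6   # FT program effect realized after the baseline test

true_effect = EARLY_EFFECT + LATER_EFFECT
ft_spring = START + GROWTH + true_effect
nft_spring = START + GROWTH

# Baseline given at commencement: neither group has been affected yet.
on_time = gain_difference(START, ft_spring, START, nft_spring)

# Baseline given late: the FT baseline already contains the early effect,
# so FT pupils look "more able" and part of the program impact is removed.
late = gain_difference(START + EARLY_EFFECT, ft_spring, START, nft_spring)

print(true_effect, on_time, late)
```

Under these assumptions the on-time baseline recovers the full 10-point effect, while the late baseline yields only 6 points: the 4 points gained before baseline testing are "adjusted out."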

Nevertheless, the decentralized nature of this operational plan led
to variation in the quality and completeness of data collected. That is,
consultants differed in their approaches to training, scheduling, and
documentation such that occasionally irreplaceable gaps in baseline data
(both rosters and test battery) resulted. These data problems were
largely eliminated in subsequent data collection when it became possible
to (1) establish permanent centralized data collection managers who were
called "field supervisors," (2) replace "regional representatives" with
locally recruited "site coordinators," and (3) establish uniform regional
training programs, which were attended by site coordinators and
supervising testers. These changes amounted to centralizing data collection
management at SRI (as implemented by the field supervisors), and further
standardizing the training and administration procedures (senior testers
attended regional training). By the Spring, 1971, testing period, these
procedural revisions were sufficiently refined that the quality and
quantity of relevant data were clearly superior to those obtained in Fall, 1969.

Under this improved standardized testing procedure, classroom testing
was conducted by a test team consisting of a supervising tester, an
assistant tester, and two or more aides. Several such teams worked at each
site. The supervising testers within a testing site reported to the site
coordinator, who in turn was responsible to a designated field supervisor.

Regional training was conducted by SRI training experts and field
supervisors. All site coordinators and field supervisors were required
to demonstrate understanding and competence in the administration of all
tests and in the completion of all control and data forms. These regional
trainees, in turn, conducted local training for test assistants and aides
(project hires) under the supervision of the field supervisor.

Testing was accomplished within the second to fourth week after
commencement and before completion of the school year. Again, testers
were randomly balanced across FT and NFT classes. Upon completion of
testing, test booklets and data forms were shipped to SRI for hand
processing. The processing of test data into machine readable form was
accomplished with very high precision by means of two coding verification
steps (100 percent and variable sample) and through comprehensive editing
and resolution procedures in the FT data storage system. Roster and test
data were linked in this storage procedure.



Parent Interview Procedure 



The administration of the parent interview (PI) survey was
subcontracted to National Opinion Research Center (NORC). However, SRI was
responsible for developing and revising the instrument, for selecting
projects and specifying the parents to be interviewed, and for scheduling
the survey and follow-up activities. In close collaboration with SRI,
NORC was responsible for the recruitment, training, and supervision of
all interviewers. In general, NORC attempted to recruit local (40 sites)
interviewers through the assistance of the FT directors and PAC chairmen.

The respondent to the PI was the mother or the mother surrogate when
available; in the majority of cases, the respondent was the mother. If
neither the mother nor a mother surrogate was a member of the household,
the father was sought as respondent; if he too was unavailable, a
responsible adult in the household was interviewed.

The designation of participants for each PI interview sample was
based on the corresponding year's pupil test sample. For both the 1970
and 1971 administrations of the PI, the following criteria were employed:

(1) At sites where 110 or fewer entering pupils were tested,
all corresponding households were scheduled for
interviews.

(2) At sites where the number of pupils tested exceeded 110,
random selection of 110 households defined the interview
sample. Thus, up to, but not more than, 110 interviews
were administered at each site, each year.
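
The two criteria above amount to a capped random sample per site. A minimal Python sketch follows, assuming one household per tested entering pupil; the function name and arguments are illustrative and are not SRI or NORC terminology.

```python
import random

SITE_CAP = 110  # maximum number of interviews per site, per year


def select_interview_households(households, cap=SITE_CAP, rng=None):
    """Return the households to be scheduled for interviews at one site.

    If the site has `cap` or fewer households, all are scheduled;
    otherwise `cap` households are drawn at random without replacement.
    """
    households = list(households)
    if len(households) <= cap:
        return households
    rng = rng or random.Random()
    return rng.sample(households, cap)
```

Passing a seeded `random.Random` instance as `rng` makes a given site's draw reproducible, which would help if the selected list later had to be audited.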

The Follow Through sponsors provided NORC with lists of households
to contact. In Spring, 1970, a 96.5 percent net response rate was obtained
from an original assignment of 14,800 cases. In Spring, 1971, of the 15,17[?]
possible respondents, all but 397 were interviewed, giving a net response
rate of 97.5 percent. These approximately 15,000 interviews per year
represent the total survey effort. As has been noted earlier, relevant
Cohort I and Cohort II samples are only subsets of these efforts.
Nevertheless, we feel it is reasonable to assume that the same return rates
prevailed for these subsamples.

As a control on quality, NORC checked between 15 percent and 25
percent of the respondents by phone or mail to assure that the interview had
taken place. Except for the first few interviews at each site, interviews
were selected for validation at random. In addition to verifying the
original interview and the answers to a few key questions, NORC asked
respondents questions about their reaction to the interview itself. The
validation forms were transmitted to SRI. Whenever a respondent refused
the first attempt to interview, another interviewer (often the supervisor)
recontacted the respondent and either obtained the interview or learned
more about the reason for the refusal. The overall direct refusal rate
was well under 1 percent.


Teacher and Aide Survey Procedure 


Survey data for teachers and aides for this interim evaluation are 
based on self-reports of respondents to the Teacher Questionnaire and the 
Aide Questionnaire. These questionnaires did not entail any in-person 
interviewing, nor were any forced compliance procedures employed. 

During the last two weeks of April, 1971, questionnaires were mailed 
to all Follow Through coordinators, to be distributed to both Follow Through 
and non-Follow Through teachers whose classes had been tested in the Fall, 
1970. During the last week of May, an additional 200 questionnaires were 
sent out to Follow Through coordinators for teachers whose classes were 
tested in the Spring but not in the Fall. Each Follow Through teacher 
received one aide questionnaire for a regular classroom aide to fill out. 
If more than one aide worked in the classroom, the teachers were instructed 
to alphabetize the aides by last name and give the questionnaire to 
the second one on the list. There were 1,774 teacher questionnaires and 
933 aide questionnaires sent out. Of the teachers, 993 were Follow Through 
and 781 were non-Follow Through. 

Of the 1,774 teacher (993 Follow Through and 781 non-Follow Through) 
questionnaires sent out, 1,462 were completed and returned for a response 
rate of 82 percent, and of the 993 aide questionnaires sent out, 804 were 
completed and returned for a response rate of 80 percent. At present, no 
data are available on determinants of refusal. 
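
The response rates quoted above are simple ratios of questionnaires returned to questionnaires sent. A minimal sketch of that arithmetic follows, using the counts given in the preceding paragraph; the helper name is illustrative, and the percentages in the text appear to be rounded to whole numbers.

```python
def response_rate(returned, sent):
    """Percentage of questionnaires completed and returned."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * returned / sent


# Counts as quoted in the paragraph above.
teacher_rate = response_rate(1462, 1774)
aide_rate = response_rate(804, 993)
print(f"teacher: {teacher_rate:.1f} percent, aide: {aide_rate:.1f} percent")
```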

Completed questionnaires were processed by a team of three coders, 
one checker, and one supervisor. For the aide questionnaire, two coders, 
one checker, and one supervisor were used. Each booklet was coded and 
checked for errors and open-ended responses. The supervisor verified 
processing of every fifth questionnaire. A two-step verification of 
keypunching was employed: 100 percent verification and 5 percent verification. 

Classroom Observation Procedure 

Observers completed a programmed home study course in which they 
were acquainted with the Classroom Observation Instrument and learned 
the definitions of the coding symbols. They then attended a four-day 
intensive training session with an instructor for every four trainees. 
This session dealt with standard procedures for the conduct of all 
actual observations. After passing a final coding criterion test, observers 
went to their designated field sites. 

In Spring, 1971, each classroom in the Classroom Observation subsample 
was observed for two consecutive days. An Observation Summary 
Form, mainly identifying the classroom and noting number of children, 
was filled in once. The Classroom Check List and the Five-Minute 
Observation of interactions were completed at the rate of approximately 
four per hour, or approximately 16 per day (fewer for most kindergartens, 
which were half-day sessions). One completed CCL and FMO constituted a 
unit called Classroom Observation Period (COP). Observers were instructed 
to include one of each kind of adult/child grouping and one of each major 
type of activity as the focus of their five-minute observations. A COP 
constituted the unit on which the frequency of the process variable was 
based. 


Reliability checks were made by on-site simultaneous coding by observer 
and SRI Classroom Observation trainer for two hours in the same 
classroom (eight five-minute observations). 

Observations were conducted by the same observers in both Follow 
Through and non-Follow Through classrooms at the same site. 

Project Descriptor Procedures 

For the project descriptors study approximately 1,500 individual 
pieces of data were collected for each of the 45 projects in the sample. 
In all, more than 75,000 pieces of data were collected and processed. 
Generally, data collection involved a heavily concentrated effort 
because of the time limitations. Data sources are delineated below, along 
with estimates of how much of the data we actually have been able to 
collect from each source. 


The high return rates shown in the table of secondary sources may be 
somewhat misleading for two reasons. First, the percentages given there 
represent the proportion of the total sample (N = 45) for which we have 
data. For some sources some of the sample projects were not included in 
the original data gathering. Hence for the ELSEGIS data a 100 percent 
completion rate is unobtainable because one of our projects was not 
surveyed by ELSEGIS. On the other hand, in a significant number of cases, 
portions of the information contained in a given source are either 
incomplete or apparently inaccurate. Lack of completeness has been a 
particularly difficult problem with the project applications. Copies of 
individual project applications were requested directly from the 45 sample 
projects. These applications were received over a period of several 
months, extending into Spring, 1972, well into the data processing stage 
of our study. 

                                          Percentage of Projects for Which 
Secondary Source                          Source Has Been Obtained 

Census                                              100% 
NCES and State Education Agency 
  publications                                       75 
CPIR data                                            91 
ELSEGIS 
  Part A -- Staff                                    95 
  Part B -- Finances                                 97 
  Part C -- Individual schools                       90 
Local Follow Through project (annual) 
  applications to USOE                              100 
SRI Follow Through Data Bank 
  Child Roster data                                 100 
  Teacher/Aide Questionnaire data                   100 
  Parent Interview data                              64 
  Personnel Roster data                              84 

Because of the pressing need to begin the data processing stage of 
our investigation, the codes had to be formulated before a fully 
representative set of proposals had been made available to SRI. Thus the 
codes, although they do provide for a wide variety of responses, do not 
cover every contingency and are more suitable for some projects than for 
others. For example, proposals from New York City and Philadelphia were 
not received until after the codes had been constructed. Certain 
distinctive features of both sets of projects could only be approximated by 
the codes available. This phase of our study has produced codebooks for the 
various source documents and an elaborate set of coding instructions for 
the project applications. 

Given these constraints, the reliability and validity of the resultant 
data, although high, could be improved. Attempts were made to improve 
the accuracy of project data through phone calls to the projects, 
examination of the project master files maintained at the U.S. Office of 
Education, Follow Through Branch, and discussion with the SRI field staff. 
Also, among the techniques used to assess or improve the reliability of 
descriptor data were: 


(1) Use of multiple sources to check for inter-source consistency 

(2) Estimation of internal consistency within a given source 

(3) Checks for reasonable consistency across time when appropriate 
data were available. 

Community Studies, 1970-71 

Community studies conducted during the evaluation year 1970-71 are 
part of a series of developmental studies aimed at identifying the 
patterns of institutional change--both those taking place within the school 
and those involving relationships between the school and the larger 
community--emerging within Follow Through. 

The 1970-71 phase of the studies was designed to test the viability 
of a tentative model of institutional change that had been developed from 
case studies conducted between 1968 and 1970. This model suggested that 
the level of parental involvement in the local projects was a critical 
factor in the process of institutional change. It further identified 
certain determinants of the level of parent involvement in Follow Through. 
These factors include organizational characteristics of the project and 
its Parent Advisory Committee (PAC), social-psychological attributes of 
the parents (role expectations, conflict, etc.), and resources. 

The 1970-71 effort was conceived as a preliminary investigation into 
the relationship between certain psychosocial and organizational variables 
and the level of parental involvement in Follow Through projects. Nine 
projects were purposively selected to comprise the sample for the 
exploratory study. Because of the sample's small size and judgmental nature, 
no generalizations--especially conclusions regarding differences among 
sponsor approaches, projects, or geographic regions--are warranted. 

The study consisted of two parts. For the part concerned with the 
relationship between organizational aspects of individual projects and 
the overall level of parental participation, unobtrusive techniques were 
utilized to gather information on individual projects. Annual project 
applications and interim and progress reports were obtained and 
scrutinized. USOE/Follow Through master files, containing reports on 
individual projects prepared by General Consultants and Specialists, were 
consulted. Demographic and population data describing the community and the 
Local Education Agency were obtained from sources at the Bureau of the 
Census and at the National Center for Educational Statistics (particularly 
ELSEGIS and CPIR), respectively. 

Both the qualitative and quantitative data obtained from these 
secondary sources were used to develop a descriptive profile for each 
project, which focused mainly on the school year 1970-71 but did not 
necessarily exclude evolutionary and developmental factors extending back to 
prior years. The profiles were reviewed by project directors, PAC chairmen 
at the respective projects, and project officers. This review was undertaken 
not only to establish the factual accuracy of the data but also, and 
equally importantly, to determine whether the "gestalt" of a given 
profile coincided with the views of persons familiar with that project. 
The profiles were then synthesized and analyzed. 

For the phase of the study concerned with social-psychological 
factors underlying parental involvement, semi-structured interviews 
were employed. Respondents were purposively selected samples of parents 
and teachers in the nine projects. Although general guides were used in 
conducting the interviews, respondents were allowed and even encouraged 
to narrate their perceptions of their own and others' roles and role 
relationships. No attempt was made to prevent respondents from offering 
personal, subjective descriptions of causal patterns that they believed 
linked various role performances to variations in mode of involvement or 
to variations of activities. 

This technique was selected with the aim of ascertaining a range 
of expectations, attitudes, norms, and patterns of behavior for parents 
and teachers. Although no two respondents received exactly the same 
interview treatment, questions pertaining to expectations and behaviors 
were structured and similar for all respondents. Follow Through parents 
(including PAC chairmen) and teachers were asked about the attributes, 
broadly defined, that they perceived as most essential for determining 
legitimacy and probability of their participation in the program. 

Since these studies were conducted on only 7 of the projects included 
in this evaluation, their results are not included in this report. The 
interested reader is referred to the Draft report (SRI, 1972b, Appendix E) 
and to a forthcoming revision scheduled for publication in Spring 1973. 

Definition and Development of Evaluation Variables 

Before we describe the procedure for defining and developing 
evaluation variables from instruments and data collected, it might be useful 
to review briefly the general model for evaluating the overall FT program 
and its planned variations. In its simplest form, the model assesses the 
differences between participants and nonparticipants in Follow Through on 
measures and variables of interest. These measures or variables are 
referred to as "outcomes," and differences in outcomes are defined as 
"impacts." Before any such assessed differences may be attributed to 
program participation (i.e., before impact may be ascribed to the program), 
it is necessary to establish that participants and nonparticipants were 
comparable (i.e., establish some degree of ceteris paribus) before 
implementation of the program. These measures of comparability are defined 
as control variables or "inputs" in the evaluation model. Finally, after 
comparability of inputs has been established (i.e., input differences have 
been controlled), impacts can be interpreted in terms of a third class of 
variables, namely, process or treatment variables and components. Thus 
there are three classes of variables in our evaluation model: (1) input 
or control, (2) process or treatment, and (3) impact or outcome. 
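
In its simplest form the model described above compares group means. The sketch below illustrates only that simplest form: it checks comparability on an input (baseline) measure and then reports the raw FT-minus-NFT difference on an outcome. The records are made-up values, the field names (`group`, `baseline`, `outcome`) are hypothetical, and no statistical adjustment is attempted, so this should not be read as the analysis procedure actually used in the evaluation.

```python
from statistics import mean


def group_means(records, key):
    """Mean of `key` for FT and NFT records, in that order."""
    ft = [r[key] for r in records if r["group"] == "FT"]
    nft = [r[key] for r in records if r["group"] == "NFT"]
    return mean(ft), mean(nft)


def mean_difference(records, key):
    """FT mean minus NFT mean on one variable."""
    ft_mean, nft_mean = group_means(records, key)
    return ft_mean - nft_mean


# Made-up illustrative records, one per pupil.
records = [
    {"group": "FT", "baseline": 20, "outcome": 31},
    {"group": "FT", "baseline": 22, "outcome": 35},
    {"group": "NFT", "baseline": 21, "outcome": 30},
    {"group": "NFT", "baseline": 21, "outcome": 32},
]

input_gap = mean_difference(records, "baseline")  # comparability check on an input
raw_impact = mean_difference(records, "outcome")  # unadjusted outcome difference
print(input_gap, raw_impact)
```

An input gap near zero is what licenses reading the outcome difference as an impact; when the gap is not near zero, the input differences must first be controlled.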

When we apply this model to the Follow Through experiment, we see 
that it applies at three interrelated levels--or impact foci--each roughly 
corresponding to a general objective held for the Follow Through program 
by at least one stakeholder group. The first level, and the one central 
to this report, is the child. What are the overall and differential 
impacts of Follow Through on the educational, psychological, and social 
growth and development of children? The second level, obviously 
interrelated with the first, is the parent/home and community. What are the 
relative overall and differential impacts of Follow Through on parent, 
parent-child, parent-school, and parent-community factors? The third 
level, also interrelated with the first and second, is the teacher, 
classroom, and school. What, for example, are the overall and relative impacts 
of Follow Through on teacher attitudes and behavior; classroom practices; 
curriculum reforms; and school funding, staffing, and service policies? 

For each of the above levels, the evaluation question was posed in 
two forms: what, if any, are the general or overall impacts; and what, 
if any, are the differential (i.e., input or process specific) impacts 
of Follow Through? The first form of the question is designed to assess 
and evaluate Follow Through as a national program. The second form is 
appropriate for assessing and evaluating planned variations of Follow 
Through. We believe that limitations in the scope, representativeness, 
and duration of "treatments" of the programs included in this interim 
report, as well as a number of data and analysis problems discussed below, 
severely restrict answers to such evaluation questions at this time. 
Nevertheless, the soundness of this evaluation approach should be evident 
and, as applied to these interim data, could reveal emerging program 
effects. 

Basic Variable by Category Matrix for Evaluation Data 

Table 6 presents a summary of the organization of data sources and 
instruments into appropriate evaluation variables in terms of the basic 
evaluation model. This table shows the separation as well as the overlap 
of the various instruments in terms of how each contributed to evaluation 
variables. In the following paragraphs, we describe our method and 
procedure for organizing our data into appropriate evaluation variables and 
refer to relevant operational definitions. 



                                  TABLE 6 

        ORGANIZATION OF DATA SOURCES INTO CATEGORIES OF EVALUATION 
               VARIABLES FOR ASSESSMENT OF INTERIM EFFECTS 

                      CONTROL             OUTCOME               DESCRIPTIVE PROCESS 
                      MEASURES            MEASURES              MEASURES 

CHILD EVALUATION      ROSTER              ROSTER (ATTENDANCE)   PROJECT DESCRIPTORS 
DATA                  BASELINE TESTS      POSTTESTS             CLASSROOM OBSERVATION 
                      PARENT INTERVIEW                          PROCEDURE 

PARENT EVALUATION     PARENT INTERVIEW    PARENT INTERVIEW      (COMMUNITY STUDIES) 
DATA 

TEACHER EVALUATION    TEACHER/AIDE        TEACHER/AIDE 
DATA                  QUESTIONNAIRE       QUESTIONNAIRE 

Variables at the Child Level 

The principal source of variables at the child level of evaluation 
was the Follow Through test battery, although the roster and the project 
descriptors also contributed data. 

Outcomes--The FT battery provided measures of constructs within two 
domains, cognitive and noncognitive. Cognitive measures were both 
heterogeneous and varied from year to year, whereas noncognitive measures were 
obtained from the single, relatively homogeneous Faces Attitude Inventory. 
Correspondingly, the pupil cognitive variables (e.g., "achievement") are 
represented at several levels of specificity or aggregation, whereas a 
single noncognitive or "attitude" variable is employed. These achievement 
constructs are as follows: 

(1) Total achievement--defined as the raw score sum of all 
correct responses on all cognitive test items. 

(2) WRAT achievement--defined as the raw score sum of correct 
responses to the WRAT test. 

(3) Quantitative skills--defined as the raw score sum of correct 
items pertaining to quantitative concepts, such as numeration, 
operations (addition, subtraction, etc.), and word problems. 

(4) Reading skills--defined as the raw score sum of correct 
responses to items requiring reading or reading-related skills 
(including pre-reading), such skills as alphabet/letter 
recognition, matching and copying, figure copying, word 
matching, symbol matching, oddity discrimination. 

(5) Language Arts--defined as the raw score sum of correct 
responses to items requiring language, lexicographic, or 
grammatical skills, such as analogies, word meaning, spelling, 
and concept activation. 

(6) Cognitive Processes--a residual category consisting of the 
raw score sum of correct responses to items requiring 
perceptual motor skills and concept identifications. 

It should be noted that in this organization of instruments into 
variables, the quantitative, reading, language, and cognitive process 
variables consist of mutually exclusive subsets of the total achievement 
variable. The WRAT achievement variable, on the other hand, overlaps 
with both the total achievement variable and the quantitative, reading, 
and language variables. A reliability analysis of these outcome measures 
is presented in Annex B of this report. 
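
The relationship among these variables (four mutually exclusive subscales that add up to the total, plus an overlapping WRAT score) can be made concrete with a small Python sketch. The item identifiers and their assignment to subscales and to the WRAT are invented for illustration; the actual item assignments are not given in this section.

```python
# Hypothetical item-to-subscale map; each item belongs to exactly one subscale.
ITEM_SUBSCALE = {
    "q1": "quantitative", "q2": "quantitative",
    "r1": "reading", "r2": "reading",
    "l1": "language",
    "c1": "cognitive_process",
}
# Hypothetical set of items that also count toward the WRAT score.
WRAT_ITEMS = {"q1", "r1"}


def achievement_scores(responses):
    """Raw-score sums; `responses` maps item id -> 1 (correct) or 0."""
    scores = {name: 0 for name in set(ITEM_SUBSCALE.values())}
    for item, correct in responses.items():
        scores[ITEM_SUBSCALE[item]] += correct
    scores["total"] = sum(responses.values())
    scores["wrat"] = sum(responses.get(item, 0) for item in WRAT_ITEMS)
    return scores


pupil = {"q1": 1, "q2": 0, "r1": 1, "r2": 1, "l1": 0, "c1": 1}
scores = achievement_scores(pupil)
print(scores)
```

Because every item maps to exactly one subscale, the four subscale scores always sum to the total, while the WRAT score draws on items already counted elsewhere.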


The attitude measure initially comprised two components: psychological 
(feelings toward self, others, school, and learning) and sociological 
(perceived peer acceptance, social distance, and associated 
sociograms). These latter measures were abandoned because of lack of 
reliability and validity, yielding a single measure for the noncognitive, 
or attitude, variable. This measure is defined as the scaled sum of 
self-reports of feelings (sad = 1, so-so = 2, happy = 3) toward a series of 
standard situations. 
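
As a small illustration of the attitude score, the sketch below sums the 1/2/3 codes given above over a pupil's self-reports. The function name and the example reports are hypothetical, and any further rescaling implied by "scaled sum" is not specified in this section and is therefore left out.

```python
# Response codes as given in the text.
FEELING_SCORE = {"sad": 1, "so-so": 2, "happy": 3}


def attitude_score(reports):
    """Sum of coded feelings over the situations a pupil responded to."""
    return sum(FEELING_SCORE[report] for report in reports)


example_score = attitude_score(["happy", "so-so", "sad", "happy"])
print(example_score)
```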

Finally, a nonreactive measure of interest was obtained from the 
classroom roster for use as a child outcome. This measure, attendance, 
was simply the number of days absent in the preceding academic year up 
to the time of rostering. 

Controls--A preliminary inspection of test, roster, and descriptor 
data indicated that although the FT and comparison (NFT) school pupils 
were similar in many, if not most, respects, important initial 
differences existed, particularly on measures assumed (or demonstrated) to be 
related to outcomes, such as baseline test scores. Consequently, three 
categories of control (input) variables were defined and subsequently 
utilized to generate "comparability" in the analysis for program effects. 
These categories of control variables were: 

(1) Baseline differences--defined as the entering scores on 
each of the above defined outcome variables (i.e., achievement, 
WRAT, quantitative, language, reading, cognitive 
process, and affect). 

(2) Environmental differences--defined as home, school, and 
community demographics and SES factors, such as parents' 
education, occupation, race, income, sex and employment 
of head of household (HH), urbanization of the community, 
the pupil/teacher ratio and average per pupil expenditures 
of the school. So that these measures could be aggregated 
within an analysis, education was dichotomized at high school 
diploma, race as Black versus non-Black, income as poverty 
eligible versus noneligible,* and employment of household 
head (HH) as working versus seeking employment. 

(3) Individual/experiential differences--defined as average 
pupil age, sex, race, and preschool experience. Pupil 
age was calculated in months at the time of baseline 
testing. Sex is represented as the percent male pupils, 
race as the percent Black, language as the percent of 
pupils for whom English is the first language, and 
preschool is either (a) actual months of Head Start or 
equivalent preschool experience, or (b) percent of pupils 
having some Head Start or equivalent education. 

Variables at the Parent Level 

Parent involvement in their child's learning and parent 
participation in school activities (at least at the level of awareness of or 
knowledge about the school program) are considered important or even 
crucial in the various Follow Through models. In some, such as the 
Parent Implemented models, parents are as much or more the objects of 
program influence as their children. Other models (e.g., the Florida 
Parent Education Model) seek to influence child development through the 
mediation of parent behavior and direct their efforts to parent education. 


*See the explanation of the OEO poverty guidelines for interpretation of 
eligibility on page 84 of this report. 
Regardless of differences in emphasis and focus, all Follow Through 
approaches attribute substantial importance to parent attitudes and 
behavior. Therefore, several evaluation variables at the parent level 
were defined and incorporated in the impact analyses. The principal 
source of these parent level impact variables was the Parent Interview 
Instrument. 

Measures of parents' attitudes and behaviors were obtained from 
selected judgmental groupings of criterion items on the PI schedule. 
Responses to each such item were tabulated and distributed, and the 
resultant distributions were each dichotomized at their respective 
medians. The rescaled item scores (1,0) were then aggregated such 
that the variable scores were the simple sum of item scores. Finally, 
these sums were converted to a standardized distribution through use of 
the formula 



$$ \frac{X - M}{SD} $$ 

where X is the aggregated item scores, M is the mean of the aggregated 
scores, and SD is the standard deviation. These standardizations were 
performed separately for FT/NFT by variable, cohort, and grade stream. 
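
The scaling steps described above (dichotomize each item at its median, sum the 0/1 item scores, then standardize the sums) can be sketched as follows. Two details are assumptions made for the example because the text does not settle them: responses strictly above the item median are scored 1 and all others 0, and the population standard deviation is used. The function names and data are illustrative only.

```python
from statistics import mean, median, pstdev


def dichotomize_at_median(values):
    """Score 1 for responses above the item median, else 0 (tie rule assumed)."""
    cut = median(values)
    return [1 if v > cut else 0 for v in values]


def variable_scores(item_responses):
    """Sum the 0/1 item scores per respondent.

    `item_responses` holds one list per item, each with one entry per respondent.
    """
    rescaled = [dichotomize_at_median(item) for item in item_responses]
    return [sum(person) for person in zip(*rescaled)]


def standardize(scores):
    """Apply (X - M) / SD to every aggregated score."""
    m, sd = mean(scores), pstdev(scores)
    if sd == 0:
        raise ValueError("all scores are identical; cannot standardize")
    return [(x - m) / sd for x in scores]


# Three hypothetical items answered by four respondents.
items = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 1, 5, 5]]
raw = variable_scores(items)
z = standardize(raw)
print(raw, z)
```

In practice the standardization would be run once per group, since the text states that it was performed separately for FT/NFT by variable, cohort, and grade stream.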

The specific parent impact variables used in this interim evalua- 
tion are the following: 

(1) Parent-child interactions, or the extent to which parents 
report they actively interact with their children in such 
activities as talking with their children, taking their 
children on trips, helping their children with school work, 
reading to them, accepting assistance from them, and 
acknowledging their progress in school. 

(2) Parent-school involvement, or the extent to which parents 
report they are actively participating in various school-
related activities, such as classroom visits, volunteer 
assistance, parent/school meetings, and external contacts 
with school personnel. 

(3) Child academic expectations, or the extent to which the 
parent reports satisfaction with the child's progress and 
optimism regarding the child's future, both academic and 
nonacademic. (E.g., what are the child's expected grades, chances 
of getting a good job, chances of going on to college?) 

(4) Sense of control, or the extent to which the parent 
reports a sense of concern and control over school 
procedures, educational reforms, and school awareness of and 
responsiveness to parent and community desires and needs. 

Other parent measures that were similarly scaled but that were not 
included in this interim analysis (generally because of an excessive 
amount of missing data or inadequate response variability) were: 

(5) Parent-social interactions, or the extent to which parents 
actively interacted with others in the community (e.g., 
visit with friends, visited by friends, club memberships, 
etc.). 

(6) Parent locus of control, or the extent the parent reports 
a sense of control over his/her fate and life chances, as 
well as those of others. 

(7) Parent participation in PAC, or the extent to which the 
parent reports awareness of and/or participation in the 
Policy Advisory Committee (note: variable has meaning only 
for FT parents). 

Since the same intraproject variability (FT/NFT) noted for child 
outcome measures would be present and potentially bias estimates of 
program impact on parent outcomes, it was necessary to develop and 
employ a set of variables to control for initial or unintended differences 
in the parent sample. Many of these control variables are identical to 
those developed for child level analyses to control for effects of 
environmental differences. 

Variables at Teacher Level 

Outcomes--To assess the relative impact of Follow Through on teacher 
attitudes and behavior, information obtained from teacher and aide 
responses to the questionnaires administered in 1969 and 1970 was 
assembled into outcome measures of interest. The procedure adopted for 
developing these measures was essentially the same scaling method as that 
used to develop parent outcome and control measures. Items were assembled 
into logical and judgmental groupings, responses were rescaled after 
dichotomizing the distribution for each item at or nearest the median; 
the rescaled item scores were then aggregated into a grouping or variable 
score. These scaled variable scores were finally transformed into a 
standardized metric with Mean = 0, SD = 1. 


The specific teacher outcome variables prepared for analysis of 
program impact are as follows : 

(1) Parent-educator image, or the extent to which teachers 
reported they felt it essential to "get together with 
parents outside of the classroom" for purposes of 

• Improving children's learning 

• Improving classroom teaching 

• Learning parents' views on teaching 

• Improving school services to parents 

• Improving school services to children 

• Improving school services to community 

• Parental understanding of school program. 

The logic of this measure or index of teacher attitude 
toward parents is that if parents are being brought more 
into the mainstream of their children's education through 
the FT programs and if their participation is being viewed 
as helpful by teachers, these effects should be reflected 
in differences between FT and NFT teachers on scores for 
the above variable. 

(2) Professional acceptance of method, or the extent to which 
the teacher reports she would not prefer to adopt some 
teaching approach other than the one she is currently 
using. The logic for use of this variable as a measure of 
impact is that if teachers perceive their current method 
as appropriate and effective to the needs of the children, 
they should express less interest in alternative methods. 
Essentially, this measure provides an indirect assessment 
of teacher satisfaction with her current approach. 
Controls--Again, as with parent and child variables, attention was 
given to establishing comparability of teachers on variables not defined 
or considered as impact foci, but which were believed to be or 
demonstrated to be related to (or concomitant with) such impacts. Two sources 
of such "control variables" data were the teacher questionnaire and 
project descriptor instruments described in a preceding section. 

Two general categories of teacher control variables were defined and 
generated: 

(1) Resource, or the general staffing, expenditure, facility, 
and related resource patterns of the school. Variables 
comprising this category of controls were: 

• Urbanization of the school district 

• Average district expenditure per pupil 

• Average district pupil/teacher ratio 

• Number of helpers (aides, paraprofessionals, 
parents) made available to the teacher 

• The book and library facilities available to pupils 
in the class. 

(2) Background/Experience, or the ethnicity, attitude, and 
experience of the teacher. Variables comprising this 
category of controls were: 

• Teacher race (Black vs non-Black) 

• Satisfaction with working conditions, or the average 
scaled response (very satisfied to dissatisfied) 
indicating teacher satisfaction with working conditions 
in her classroom on such specifics as equipment, supplies, 
space, class schedule, salary, and planning time 

• Community closeness, or whether or not the teacher is a 
resident (new versus long time) of the dominant pupil 
community 

• Teaching choice, or the extent to which the teacher 
actively sought her current school and classroom 
assignment (teacher's current school and classroom 
assignment resulted from her request, from an 
administrative request, or from an administrative assignment) 

• Experience, or the sum of the above-median responses on 
training and experience questionnaire items, such as 
highest grade level attained, degree attained, social 
science courses taken, type of certificate held, tenure, 
years and level of full time teaching. 
Project Variables 

The project descriptor procedure was essential to the development of 
measures that usefully characterize the sites in which effects were 
evaluated. Of the many such descriptors developed and analyzed (SRI, 1972b, 
Appendix D) we selected the following set as most useful in describing 
the projects in this evaluation: 

(1) Region--geographic location of project (see Figure 5). 

      Northeast 
          New England 
          Middle Atlantic 

      North Central 
          East North Central 
          West North Central 

      South 
          South Atlantic 
          East South Central 
          West South Central 

      West 
          Mountain 
          Pacific 

(2) Urbanism--distance to nearest Standard Metropolitan Statistical 
Area (SMSA) 

• Within SMSA 

• [illegible] 20 miles from nearest SMSA 

• 30 to 40 miles from nearest SMSA 

• 50 to 70 miles from nearest SMSA 

• 75 to 120 miles from nearest SMSA 

(3) Size of nearest SMSA--1970 Census population of SMSA in 
which the project resides or is proximate 

(4) Percent nonwhite--percentage of community population that 
consists of minority group members 

(5) Project Size--number of pupils participating in project 

(6) Average number of pupils per PAC member in the project 
(project enrollment ÷ size of PAC) 

(7) Follow Through Per Pupil Cost--Follow Through expenditure 
per child per year, in dollars. 
On the basis of discussions with sponsors regarding the elements
considered important to their models, variables were defined on the
activity, grouping, and interaction codes of the Classroom Observation
Instrument.

Variables defined on the Classroom Checklist included classroom
grouping arrangements (e.g., adult with small group, child alone) and
activities engaged in (e.g., block play, arithmetic/numbers). For
example, a variable such as "wide range of activities" was designed to
capture the simultaneous occurrence of many activities in the classroom,
which would be expected to be more frequent in "open classroom" models
than in more highly structured, academic models.

Variables defined on the Five Minute Observation sometimes required
information from one code or one frame (e.g., "Adult asks child a
thought provoking question") and sometimes required information from a
sequence of frames (e.g., "Adult question, child response, adult praise").

Forty-one variables were created.

Since some information reduction appeared necessary, several general
dimensions on which educational models would be expected to differ were
hypothesized, such as child-directed versus teacher-directed and structured
versus open.

The 41 variables were then arranged in a correlation matrix and
subjected to a principal components analysis. Two varimax rotations were
then performed, one on the first five factors and one on the first ten
factors resulting from the analysis. The ten-factor rotation accounted
for 70 percent of the matrix variance, but also resulted in several
factors that were uninterpretable. The five-factor rotation accounted
for 57 percent of the matrix variance and yielded readily identifiable
factors with a logical structure. The factors were named by assigning
each variable to the factor on which it had its highest loading. If a
variable loaded highly and nearly equally on more than one factor, it
was carried on each. This multiple loading usually occurred when a
variable loaded positively on one factor and negatively on another.
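
The reduction step described above can be sketched in a few lines. This is a minimal illustration, not the report's actual computation: the function names, the synthetic six-variable data set, and the choice of two retained components are all assumptions made for the example. It extracts principal component loadings from a correlation matrix and applies an orthogonal varimax rotation using the common SVD-based iteration.

```python
import numpy as np

def principal_component_loadings(corr, n_factors):
    """Loadings of the first n_factors principal components of a correlation matrix."""
    eigvals, eigvecs = np.linalg.eigh(corr)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_factors]      # keep the largest ones
    return eigvecs[:, order] * np.sqrt(eigvals[order])

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a (variables x factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                               # stays orthogonal
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):       # no further improvement
            break
        criterion = new_criterion
    return loadings @ rotation

# Synthetic stand-in for the classroom observation variables (hypothetical data).
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 6))
data[:, 1] += data[:, 0]          # variables 0 and 1 share one dimension
data[:, 3] += data[:, 2]          # variables 2 and 3 share another
corr = np.corrcoef(data, rowvar=False)

unrotated = principal_component_loadings(corr, n_factors=2)
rotated = varimax(unrotated)

# Share of total matrix variance accounted for by the retained factors.
explained = (rotated**2).sum() / corr.shape[0]
```

Because the rotation is orthogonal, the proportion of variance accounted for is the same before and after rotation; only the distribution of loadings across factors changes, which is what makes the rotated factors easier to name.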

Tables 7 through 11 present the variables and loadings contributing
to each of the five resultant factors. These five factors are named
self-regulatory, child-initiated interactions, programmed/academic,
expressive, and child self-learning. For each classroom in the
observation subsample, scores were computed on the five factors, and a
profile of the classroom was created.
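
The rule for carrying a variable on one or more factors can be illustrated as follows. The report does not give numeric cutoffs, so the thresholds `high` and `near_equal`, the function name, and the small loading matrix are hypothetical choices made only for this sketch.

```python
import numpy as np

def assign_to_factors(loadings, names, high=0.30, near_equal=0.05):
    """Map each variable to the factor(s) on which it is carried.

    A variable goes to the factor with its largest absolute loading, plus any
    other factor whose absolute loading is at least `high` and within
    `near_equal` of that maximum (both cutoffs are assumed, not from the report).
    """
    assignments = {}
    for name, row in zip(names, np.abs(np.asarray(loadings, dtype=float))):
        top = row.max()
        carried = [j for j, v in enumerate(row) if v >= high and top - v <= near_equal]
        assignments[name] = carried or [int(row.argmax())]
    return assignments

# Hypothetical two-factor loadings for three illustrative variables.
names = ["var_a", "var_b", "var_c"]
loadings = [[0.87, 0.10],
            [-0.38, 0.37],     # loads negatively on one factor, positively on the other
            [0.20, -0.74]]
factor_map = assign_to_factors(loadings, names)
```

In this example `var_b` is carried on both factors, mirroring the case the text describes where a variable loads positively on one factor and negatively on another with nearly equal magnitude.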




TABLE 7

FACTOR 1, SELF-REGULATORY

VARIABLE DESCRIPTION                                      LOADING

INDEPENDENT CHILD ACTIVITIES                                +.87
WIDE VARIETY OF ACTIVITIES                                  +.87
CHILD SELF-EXPRESSION                                       +.72
CHILD INFORMING SELF WITH OBJECTS                           +.60
G--BLOCKS, TRUCKS, DOLLS, DRESS-UP                          +.58
F--ARTS, CRAFTS, SEWING, COOKING, POUNDING, SAWING          +.46
ADULT WITH ONE OR TWO CHILDREN IN ALL ACTIVITIES            +.45
ADULT NEGATIVE AFFECT                                       -.38
ADULT ASKING CHILD DIRECT QUESTION                          -.37
C--ARITHMETIC, NUMBERS, MATHEMATICS,                        -.36
   ALPHABET, READING, LANGUAGE DEVELOPMENT
F--GAMES, PUZZLES                                           +.35
DIRECT QUESTION FOLLOWED BY CHILD RESPONSE                  -.35
CHILD QUESTIONING ADULT                                     +.34

TOGETHER THESE VARIABLES INDICATE A PATTERN OF CHILDREN
WORKING BY THEMSELVES AND IN CONTROL OF WHAT THEY ARE DOING
AT THE MOMENT.

TABLE 8

FACTOR 2, CHILD-INITIATED INTERACTIONS

VARIABLE DESCRIPTION                                      LOADING

ADULT INFORMING CHILD                                       -.74
ADULT WITHOUT CHILDREN                                      +.66
ADULT INFORMING CHILD SYMBOLICALLY                          -.64
ADULT ACKNOWLEDGMENT TO CHILD                               +.62
ADULT POSITIVE CORRECTIVE FEEDBACK                          +.62
ADULT COMMUNICATION FOCUS--ONE OR TWO CHILDREN              +.61
ADULT NEGATIVE CORRECTIVE FEEDBACK                          +.57
CHILD QUESTIONING ADULT                                     +.52
ALL NEGATIVE AFFECT                                         +.41
ADULT NEGATIVE AFFECT                                       +.37
ADULT COMMUNICATION FOCUS--LARGE GROUP                      -.36
CHILD RESPONSE FOLLOWED BY ADULT FEEDBACK                   +.35
F--ARTS, CRAFTS, SEWING, COOKING, POUNDING, SAWING          -.32
CHILD NEGATIVE AFFECT                                       +.30

THE POSITIVE VARIABLES ON THIS FACTOR REPRESENT CHILDREN
ASKING A QUESTION THAT INITIATES AN INTERACTION WITH A LONE
ADULT. THE ADULT THEN PROVIDES SOME FORM OF FEEDBACK. THE
NEGATIVE LOADINGS SHOW ADULTS INITIATING AN INTERACTION BY
INFORMING LARGE GROUPS. THIS IMPLIES THAT THE FACTOR
STATEMENT HAS ADULTS ALONE, CHILDREN INITIATING, AND ADULTS
RESPONDING.

TABLE 9

FACTOR 3, PROGRAMMED/ACADEMIC

VARIABLE DESCRIPTION                                      LOADING

ADULT WITH SMALL GROUPS IN ACADEMIC ACTIVITIES              +.86
C--ARITHMETIC, NUMBERS, MATHEMATICS, READING,               +.75
   ALPHABET, LANGUAGE DEVELOPMENT
AIDE PARTICIPATING IN ACADEMIC ACTIVITIES                   +.75
B--GROUP TIME, STORY, SINGING, DANCING                      -.71
ADULT COMMUNICATION FOCUS--SMALL GROUP                      +.70
ACADEMIC ACTIVITIES                                         +.66
ADULT PRAISE                                                +.64
CHILD RESPONSE FOLLOWED BY ADULT FEEDBACK                   +.57
ADULT ASKING CHILD A DIRECT QUESTION                        +.55
DIRECT QUESTION FOLLOWED BY CHILD RESPONSE                  +.52
ADULT COMMUNICATION FOCUS--LARGE GROUP                      -.52
A--LUNCH, SNACK                                             -.48
ADULT WITHOUT CHILDREN                                      +.46
ADULT COMMUNICATION FOCUS--ONE OR TWO CHILDREN              +.40
ADULT NEGATIVE AFFECT                                       -.37
ADULT NEGATIVE CORRECTIVE FEEDBACK                          -.36
CHILD SELF-EXPRESSION                                       -.35

FACTOR 3 DESCRIBES A SCENE WHEREIN ADULTS "TEACH" READING,
MATHEMATICS, OR SOCIAL STUDIES USUALLY WITH SMALL GROUPS USING
POSITIVE REINFORCEMENT FOR RESPONSES TO DIRECT QUESTIONS.

TABLE 10

FACTOR 4, EXPRESSIVE

VARIABLE DESCRIPTION                                      LOADING

CHILD TO ADULT POSITIVE AFFECT                               .81
ALL NEGATIVE AFFECT                                          .7C
CHILD NEGATIVE AFFECT                                        .69
CHILD POSITIVE AFFECT                                        .65
ADULT TO CHILD POSITIVE AFFECT                               .63
ALL POSITIVE AFFECT                                          .61
DIRECT QUESTION FOLLOWED BY CHILD RESPONSE                   .57
ADULT ASKING CHILD A DIRECT QUESTION                         .57
ADULT ASKING CHILD THOUGHT PROVOKING QUESTIONS              +.49
CHILD INFORMING ANOTHER CHILD                                .47
ADULT NEGATIVE AFFECT                                        .41
CHILD RESPONSE FOLLOWED BY ADULT FEEDBACK                    .37
ADULT INFORMING CHILD WITH CONCRETE OBJECTS                  .34
CHILD QUESTIONING ADULT                                      .32

FACTOR 4 IS CALLED "EXPRESSIVE" AND REFLECTS BOTH POSITIVE
AND NEGATIVE OVERT EXPRESSIVENESS ON THE PART OF ADULTS AND
CHILDREN. EXPRESSIVENESS IS NEGATIVELY RELATED TO AN ADULT'S
DIRECT QUESTIONING AND CHILDREN RESPONDING TO ADULT'S DIRECT
QUESTIONS.

TABLE 11

FACTOR 5, CHILD SELF-LEARNING

VARIABLE DESCRIPTION                                      LOADING

ALL CHILD SELF-LEARNING                                     +.86
CHILD INFORMING SELF SYMBOLICALLY                           +.81
ADULT COMMUNICATION FOCUS--SMALL GROUP                      -.41
CHILD RESPONSE FOLLOWED BY ADULT FEEDBACK                   -.39
ADULT PRAISE                                                -.38
ADULT WITHOUT CHILDREN                                      -.34

FACTOR 5 IS SIMPLY CALLED CHILD SELF-LEARNING AFTER ITS
TWO POSITIVELY LOADED VARIABLES. AN EXAMPLE OF A CHILD
INFORMING HIMSELF SYMBOLICALLY IS A CHILD READING OR WRITING
BY HIMSELF.



Summary of Variables by Level of Evaluation Focus

Table 12 presents a summary of these variables within the context
of the evaluation model for each separate evaluation focus. This summary
displays the integration of data in terms of evaluation variables and
will serve as a model for presentation and discussion of results in the
sections to follow.

TABLE 12

BASIC EVALUATION VARIABLES

PART I  CHILD EVALUATION DATA

  INPUT AND CONTROL VARIABLES
    NO. OF CLASSROOMS
    AVERAGE PUPILS/CLASSROOM
    QUANT. PRESCORE
    COG. PROCESS PRESCORE
    READING PRESCORE
    LANGUAGE PRESCORE
    AFFECT PRESCORE
    AGE (JUNE '71)
    % CLASSROOM MALE
    % CLASSROOM BLACK
    % PRESCHOOL (OR NO. MOS.)
    % PARENTS W/O HS DIPL.
    % PARENTS W SKILLED OCCUP.
    % PARENTS BLACK
    % PARENTS POVERTY ELIGIBLE
    % HEAD HOUSEHOLD EMPLOYED
    % HEAD HOUSEHOLD MALE

  OUTCOME VARIABLES
    OVERALL ACHIEVEMENT
    AFFECT
    ATTENDANCE
    WRAT TOTAL
    QUANTITATIVE SKILL
    COGNITIVE PROCESSES
    READING SKILLS
    LANGUAGE ARTS

PART II  PARENT EVALUATION DATA

  INPUT AND CONTROL VARIABLES
    NO. CLASSROOM GROUPS
    AV. PARENTS/CLASSROOM GRP
    % W/O HIGH SCHOOL DIPLOMA
    % W SKILLED OCCUP.
    % POS EVAL OF CHILD LRNG
    % BLACK
    % REPORTING USE OF PRESCHOOL
    % POVERTY ELIGIBLE
    % HEAD HOUSEHOLD EMPLOYED
    % HEAD HOUSEHOLD MALE

  OUTCOME VARIABLES
    PARENT-CHILD INTERACTION
    PARENT-SCHOOL INVOLVEMENT
    CHILD-ACADEMIC EXPECTATION
    SENSE OF CONTROL

PART III  TEACHER EVALUATION DATA

  INPUT AND CONTROL VARIABLES
    NO. OF CLASSROOMS
    JOB SATISF. RATING
    BOOK RESOURCE SCALE
    RACE (BLACK/NONBLACK)
    IDENT. W. COMMUNITY
    NO. OF HELPERS
    ABLE TO CHOOSE ASSIGNMENT
    TRAINING AND TEACHER EXPERIENCE

  OUTCOME VARIABLES
    PARENT-EDUCATOR IMAGE
    PROFESSIONAL ACCEPTANCE OF METHOD

PROJECT AND PROCESS VARIABLES

  PROJECT DESCRIPTORS
    REGION
    DISTANCE TO NEAREST SMSA
    SIZE OF NEAREST SMSA
    PERCENT NONWHITE
    PROJECT SIZE (PUPILS)
    NO. PUPILS/PAC MEMBER
    FT PER-PUPIL EXPENDITURE

  CLASSROOM OBSERVATION FACTOR SCORES
    SELF-REGULATORY
    CHILD-INITIATED INTERACTIONS
    PROGRAMMED/ACADEMIC
    EXPRESSIVE
    CHILD SELF-LEARNING

Analysis Methodology

In this section we describe the method and procedures used to analyze
the data for evidence of program impacts. Included are statements of the
formal hypotheses to be tested, a description of the statistical model
and procedures for testing those hypotheses (including attention to
underlying assumptions), and discussions of interpretive difficulties and
caveats associated with the nature and scope of this interim evaluation
sample.

Basic Evaluation Hypotheses

In the preceding sections we described the basic FT evaluation
design, sample, and measurement program with respect to a set of explicit
evaluation questions designed to determine the relative impact of the
Follow Through program. These specific questions can be grouped as
follows:

• General Impact -- How effective is Follow Through as a method
of improving life chances of participating children? What
is its impact on parent and teacher behavior and attitudes?
What is its impact on school and community reform?

• Specific Impact -- What is the impact of specific program
components, or "planned variations" of Follow Through, on child,
parent, teacher, school, and community characteristics?

The strategy adopted for evaluating Follow Through (that is, for
answering the above questions) is to compare at various points in time
children, parents, teachers, etc., who are participating in an FT program
with those who are not. This basic treatment versus control logic enables
assessment of impact at three levels of specificity: FT in general
(overall effects), FT planned variations (sponsor effects), and project by
project outcomes. Stated in the form of null hypotheses, these three
levels of specificity of the evaluation question can be formed as
follows:

(1) Overall -- Other things being equal, the mean performance of
FT pupils, parents, or teachers will not differ reliably from
that of the NFT sample at each assessment point (1 year,
2 years) and as measured by the respective pupil, parent,
and teacher outcome variables.

(2) Planned Variation -- Other things being equal, the mean
performance of pupils, parents, or teachers participating in
a given Follow Through model (or group of models) will not
differ reliably from that of the appropriate control group
at each assessment point (1 year, 2 years) and as measured
by the respective pupil, parent, and teacher outcome variables.

(3) Project Outcomes -- Other things being equal, the mean
performance of pupils, parents, or teachers participating in a
given FT project (regardless of sponsor) will not differ
reliably from that of the project control group at each
assessment point and as measured by the respective pupil,
parent, or teacher outcome variables.

Several properties of these hypotheses deserve consideration. First,
three different evaluation perspectives are afforded by the three classes
of hypotheses. The overall hypotheses represent FT as a national program
of assistance and intervention and assess impact on this level (i.e.,
without specific regard for "planned variation"). This approach can be
seen as asking the question, "On the average, is Follow Through as a
national program of intervention producing measurable impact?" The
sponsor, or planned variation, hypotheses ask slightly more specific
versions of this question; that is, "Have at least some of the approaches
or planned variations produced measurable impact?" or more specifically,
"Have some FT projects produced measurable impact?"

Second, each level can be divided into several subhypotheses.
Subhypotheses regarding pupil, parent, and teacher effects are
straightforward. Subhypotheses regarding cumulative one-year and two-year
effects can also be formulated. These are not rival hypotheses (i.e.,
one-year versus two-year) but, rather, correspond to assessment intervals
and allow examination of patterns of change. Specifically, under these
cumulative effects hypotheses, we ask the question, "Does Follow Through
produce a measurable impact after one year of implementation (overall, by
sponsor, by project), after two years of implementation (overall, by
sponsor, by project)?"

Third, separate replications of the FT experiment are represented
in the successive cohorts and in the separate grade level entries within
cohorts. These characteristics of the design enable the formation and
testing of several subhypotheses corresponding to program development or
improvement with different subpopulations over time. For example, "Is
the probability of first year effects greater for Cohort II than for
Cohort I (i.e., indirect evidence of improved program implementation)?
Does FT appear to produce more measurable impacts on K than on EF cohorts?"

And finally, contrasting first year effects for Cohort II-EF against
second year effects for Cohort I-K constitutes a test of differential
cumulative impact with developmental age held constant. In this interim
analysis, this test can be applied only to pupils in the first grade and
is equivalent to asking the question "Does FT have a greater impact on
first grade pupils after two years (Cohort I-K sample) than after one
year (Cohort II-EF sample) of participation and services?"

In summary, the evaluation design, including the sampling plan and
assessment schedule, provides a basis for analysis of Follow Through
effects in a large variety of contexts. Comparison can be made at the
overall, sponsor, and project levels; within cohort samples at varying
grade levels and years of experience; and across cohort samples in terms
of experience year, grade level, and both.

ANCOVA: The Method of Analysis

Analysis of covariance (ANCOVA) was the method chosen for integrating
data from the various sources and performing tests of the major and
subhypotheses. The specific ANCOVA model employed is a fixed effects,
one-way design in which approaches (planned variations) are arranged as
the treatment variables, and control variables are used as covariates.
Furthermore, within each treatment group in the one-way design, the FT
project samples and the corresponding NFT control groups are nested.
This analysis design was implemented separately for pupil, parent, and
teacher variables and for each cohort sample, grade stream, and
assessment interval. Thus, 3 (pupil, parent, teacher) X 2 (Cohort I, II)
X 2 (K, EF), or 12 independent analyses* were performed on 1971 data,
and 2 additional analyses were performed on the 1970 K and EF pupil data.
Analysis of teacher and parent variables was not performed on the 1970
Cohort I subset.
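
The count of analyses follows directly from crossing the three factors named above. A small enumeration, with label strings chosen here purely for illustration, makes the arithmetic explicit:

```python
from itertools import product

respondents = ["pupil", "parent", "teacher"]
cohorts = ["Cohort I", "Cohort II"]
grade_streams = ["K", "EF"]

# 1971 data: every respondent class crossed with every cohort and grade stream.
analyses_1971 = list(product(respondents, cohorts, grade_streams))

# 1970 data: pupil variables only, Cohort I, one analysis per grade stream.
analyses_1970 = [("pupil", "Cohort I", stream) for stream in grade_streams]

total_analyses = len(analyses_1971) + len(analyses_1970)
```

This gives 12 analyses on 1971 data and 2 on 1970 data, 14 in all.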

The units chosen for these ANCOVAs were classroom aggregated
variables. Parallel analyses were also conducted on these classroom units
in terms of sponsor and project level definitions of the treatment
variable. Detailed discussions of issues and arguments leading to these
analysis decisions are presented in Annex A of this report. Where
available, relevant data also are presented. The interested reader is
strongly urged to refer to Annex A for a more thorough explanation of our
analysis methodology.

* We recognize that the 1970 and the 1971 analyses on Cohort I data are not
independent, since the measurement units of the 1970 analyses were a
subset of those for 1971. However, a more complex repeated measures
analysis was considered inappropriate because measures and classroom
groupings changed from 1970 to 1971.

Analysis at the Sponsor Level -- The one-way fixed effects analysis
of variance-covariance design for assessing sponsor level effects is
displayed as follows:

              SPONSOR A      SPONSOR B      ...      SPONSOR X
Treatment     FT    NFT      FT    NFT      ...      FT    NFT

Each cell in this design contains all the classroom aggregated
observations, both dependent variables (DVs) and covariables, within a given
grade stream and cohort and organizes these in terms of sponsors. Since
this design collapses data across projects within sponsors, project
descriptors are included as covariables to control for inter-project
variability. This design constitutes an analysis on average sponsor
effects within cohort and grade stream. To test the individual average
effect for a given sponsor, a planned comparison (linear contrast) of
the FT versus NFT cell (treatment level) is used.
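
A one-way ANCOVA with a planned FT versus NFT comparison can be sketched with ordinary least squares. This is an illustrative reconstruction under stated assumptions, not the procedure documented in Annex A: the function name, the cell-means coding, the single pretest covariate, and the simulated classroom data are all hypothetical.

```python
import numpy as np

def ancova_contrast(y, cell, covariates, contrast):
    """Estimate a linear contrast among covariate-adjusted cell means.

    y          : (n,) outcome for each classroom
    cell       : (n,) integer cell label 0..c-1 (e.g. sponsor A-FT, A-NFT, ...)
    covariates : (n, q) control variables
    contrast   : (c,) contrast weights over the cells
    Returns (estimate, standard_error, error_df).
    """
    y = np.asarray(y, dtype=float)
    cell = np.asarray(cell)
    covariates = np.asarray(covariates, dtype=float)
    n_cells = int(cell.max()) + 1
    indicators = np.eye(n_cells)[cell]                  # one column per treatment cell
    centered = covariates - covariates.mean(axis=0)     # adjust to overall covariate means
    design = np.hstack([indicators, centered])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    error_df = len(y) - design.shape[1]
    mse = residuals @ residuals / error_df
    weights = np.concatenate([contrast, np.zeros(centered.shape[1])])
    variance = mse * weights @ np.linalg.inv(design.T @ design) @ weights
    return weights @ beta, np.sqrt(variance), error_df

# Simulated example: two sponsors, four cells, 40 classrooms per cell.
rng = np.random.default_rng(0)
true_means = np.array([12.0, 10.0, 9.0, 9.5])           # A-FT, A-NFT, B-FT, B-NFT
cell = np.repeat(np.arange(4), 40)
pretest = rng.normal(50, 10, size=cell.size)
posttest = true_means[cell] + 0.5 * (pretest - 50) + rng.normal(0, 0.1, size=cell.size)

# Planned comparison: sponsor A's FT cell against its NFT cell.
estimate, std_error, error_df = ancova_contrast(
    posttest, cell, pretest[:, None], np.array([1, -1, 0, 0])
)
```

The error degrees of freedom here equal the number of classrooms minus the number of cells minus the number of covariates, which is the same quantity as total minus one, minus treatments, minus covariates. A 95 percent interval would be the estimate plus or minus the appropriate t critical value times the standard error.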

The degrees of freedom for treatments, covariates, and error
associated with each separate implementation of this analysis design
(4 cohorts X 3 classes of DVs + 2 subset analyses = 14 analyses) are
summarized in Table 13. The "total" entry in this table indicates the number
of classrooms for which data were sufficiently complete to be included in
the respective analysis. The "treatments" entry is one less than the
number of cells and also one less than twice the number of sponsors
included in the analysis (each sponsor contributes two "levels" of
treatment: FT and NFT). Finally, the entry for covariates indicates the
number of separate covariables used in the particular analysis.
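
The bookkeeping behind these entries can be written out as a short function. The treatments rule is stated in the text; the error rule (total minus one, minus treatments, minus covariates) is an inference from the legible entries of Table 13 rather than something the text states, and the function name and the sponsor count of seven in the example are likewise assumptions.

```python
def sponsor_ancova_df(n_classrooms, n_sponsors, n_covariates):
    """Degrees of freedom for a sponsor-level ANCOVA (sketch, see assumptions above)."""
    treatments = 2 * n_sponsors - 1                     # FT and NFT cell per sponsor, minus one
    error = n_classrooms - 1 - treatments - n_covariates
    return {
        "treatments": treatments,
        "covariates": n_covariates,
        "error": error,
        "total": n_classrooms,
    }

# Cohort II-K pupil analysis: 77 classrooms, 19 covariates, 13 treatment df
# (which corresponds to 7 sponsors under the stated rule).
cohort_2k_pupil = sponsor_ancova_df(n_classrooms=77, n_sponsors=7, n_covariates=19)
```

For this case the function reproduces the 13 treatment and 44 error degrees of freedom shown for Cohort II-K pupil variables.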
TABLE 13

DEGREES OF FREEDOM FOR SPONSOR LEVEL ANCOVAS

                          PUPIL        PARENT       TEACHER
SOURCE OF VARIANCE        VARIABLES    VARIABLES    VARIABLES

COHORT I-K, 2-YR
  TREATMENTS                 23           23           21
  COVARIATES                 19            8           10
  ERROR                     313          297          221
  TOTAL                     356          329          253

COHORT I-E, 2-YR
  TREATMENTS                 15           15           13
  COVARIATES                 18            8           10
  ERROR                      94           85           50
  TOTAL                     128          109           74

COHORT II-K, 1-YR
  TREATMENTS                 13           11            7
  COVARIATES                 19            8            7
  ERROR                      44           52            4
  TOTAL                      77           72           19

COHORT II-E, 1-YR
  TREATMENTS                  7            5            7
  COVARIATES                 15            8            7
  ERROR                       8           10           12
  TOTAL                      31           24           27

COHORT I-K, 1-YR
  TREATMENTS                 19
  COVARIATES                 19
  ERROR                     168
  TOTAL                     207

COHORT I-E, 1-YR
  TREATMENTS                 11
  COVARIATES                 17
  ERROR                      47
  TOTAL                      76

Analysis at the Project Level -- The one-way ANCOVA design for the
project level analyses is formally identical to the sponsor design
described above. The only distinction between the two is in the definition
of "treatment"; in this case the individual project serves to define
treatment. The structure of this analysis within a given grade stream
and cohort is as follows:

              Project a      Project b      ...      Project z
Treatment     FT    NFT      FT    NFT      ...      FT    NFT

In this analysis design, projects are substituted for sponsors in
defining the treatment variable. This design affords precise control
over project variability and constitutes an analysis of average project
effects. To test the effect of individual projects, planned comparisons
of FT versus NFT within project were again used.

The degrees of freedom for treatments, covariates, and error for
each of the 14 project level analyses are summarized in Table 14.
Table 14 shows that the project analyses employ more treatment levels
and fewer covariables than the sponsor analyses. Since projects
containing fewer than two FT and two NFT classes were excluded from project
analyses, totals for Sponsor and Project analyses occasionally differ. For
several analyses (CII-K teacher; CII-EF pupil, parent, and teacher; and
CI-E first year pupil) the two designs were identical since the number of
projects and sponsors were equal.

Finally, analyses of overall effects were conducted on the project
analysis design. These analyses consisted of comparing the average of
all FT versus all NFT treatments within a given ANCOVA.
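
One way to express the overall comparison is as a single contrast across all cells of the design. The report does not say how cells were weighted, so the equal weighting used below, along with the function name and the three-project example, is an assumption made for illustration.

```python
def overall_contrast(cell_labels):
    """Weights for mean(all FT cells) minus mean(all NFT cells), equally weighted."""
    n_ft = cell_labels.count("FT")
    n_nft = cell_labels.count("NFT")
    return [1.0 / n_ft if label == "FT" else -1.0 / n_nft for label in cell_labels]

# Three hypothetical projects, each contributing an FT and an NFT cell.
cells = ["FT", "NFT", "FT", "NFT", "FT", "NFT"]
weights = overall_contrast(cells)
```

The weights sum to zero, as any contrast must, and applying them to the adjusted cell means gives the average FT/NFT difference across projects.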

All results of hypothesis tests were interpreted by means of 95
percent confidence intervals.

TABLE 14

DEGREES OF FREEDOM FOR PROJECT LEVEL ANCOVAS

                          PUPIL        PARENT       TEACHER
SOURCE OF VARIANCE        VARIABLES    VARIABLES    VARIABLES

COHORT I-K, 2-YR
  TREATMENTS                 55           51           41
  COVARIATES                 16            8            7
  ERROR                     258          236          146
  TOTAL                     330          296          195

COHORT I-E, 2-YR
  TREATMENTS                 21           19           15
  COVARIATES                 16            8            7
  ERROR                      85           77           41
  TOTAL                     123          105           64

COHORT II-K, 1-YR
  TREATMENTS                 15           13            7
  COVARIATES                 16            8            7
  ERROR                      19           24            4
  TOTAL                      51           46           19

COHORT II-E, 1-YR
  TREATMENTS                  7            5            7
  COVARIATES                 15            8            7
  ERROR                       8           10           12
  TOTAL                      31           24           27


COHORT I-K, 1-YR
  TREATMENTS                 23
  COVARIATES                 15
  ERROR                     113
  TOTAL                     152

COHORT I-E, 1-YR
  TREATMENTS                 13
  COVARIATES                 15
  ERROR                      47
  TOTAL                      76

Guide to Interpretation of Results

Data tables that display input/control and outcome values in
accordance with the analysis procedures described in the preceding sections
are presented in the "Results" section. These tables were prepared by
combining relevant data from separate analyses (pupil, parent, teacher)
into single displays organized in terms of the planned comparisons
previously described.

The individual project is the basic organizational unit for
presentation of results. Separate results tables were prepared for each set
of data analyzed. If more than one set of data was analyzed for a
particular project (e.g., CI-K and CII-K), each is presented and discussed
within the context of the project. Therefore, more than one data table
is presented for some projects.

There are three categories of entries on individual project data
tables: input/control variables, outcome variables, and project/process
descriptors. Both input/control and outcome measures are summarized at
the appropriate evaluation level: pupil (classroom averaged), parent
(pupil-classroom averaged), and teacher. Depending on the measure, pupil
and parent covariates are displayed as raw score averages or as percent
averages. Teacher covariates are raw score averages on the respective
variables.

Pupil outcomes are all presented as raw score averages. Parent and
teacher outcomes are all presented as scale score averages, where the
scale has a mean of 0.0 and a standard deviation of 1.0.
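
A scale with mean 0.0 and standard deviation 1.0 is what ordinary standardization produces. The sketch below assumes a simple z-score transformation using the population standard deviation; the report does not say which reference group or which form of the standard deviation was used, and the function name and sample values are hypothetical.

```python
import statistics

def scale_scores(raw_scores):
    """Convert raw scores to a scale with mean 0.0 and standard deviation 1.0."""
    mean = statistics.fmean(raw_scores)
    sd = statistics.pstdev(raw_scores)          # population SD (an assumption)
    return [(score - mean) / sd for score in raw_scores]

# Hypothetical raw questionnaire scores for five parent classroom groups.
raw = [4.0, 6.0, 5.0, 9.0, 6.0]
scaled = scale_scores(raw)
```

On this scale, an FT/NFT difference of 0.5 means the two groups differ by half a standard deviation of the reference distribution.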

Project descriptors are presented once for each project, as raw
scores on the respective variables. Classroom observation scores are
presented as average factor scores for each of the five process
dimensions.

Within each evaluation level (pupil, parent, and teacher), average
values for the FT and NFT classes and the average FT/NFT differences are
entered for each variable. These entries display the baseline, unadjusted
outcome, and adjusted outcome subgroup averages and differences for each
project sample. Standard errors for adjusted FT/NFT mean differences are
also presented.

Significance tests for outcomes are presented in the form of 95
percent confidence intervals for the adjusted mean FT/NFT differences.
Confidence intervals which do not show sign changes (+ to -) across the
interval indicate significance. These intervals can be read as follows:
"with 95 percent probability, the true mean difference between FT and
NFT on X measure is between (lower boundary) and (upper boundary) units."
In general, if the algebraic sign preceding both confidence interval
boundaries is positive, then the result shows significance in favor of
FT. If both signs are negative, then the difference is significantly in
favor of NFT. The exception is the attendance measure, which shows the
average days absent during the preceding school year. Here negative
differences favor FT (less absence).
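
The reading rule for the confidence intervals can be stated compactly in code. The function name and the `lower_is_better` flag are hypothetical, and the sketch assumes differences are computed as FT minus NFT.

```python
def interpret_interval(lower, upper, lower_is_better=False):
    """Read a 95 percent confidence interval for an adjusted FT minus NFT difference.

    Set lower_is_better=True for measures such as days absent, where a
    negative difference favors FT.
    """
    if lower > 0 and upper > 0:
        favored = "NFT" if lower_is_better else "FT"
    elif lower < 0 and upper < 0:
        favored = "FT" if lower_is_better else "NFT"
    else:
        return "not significant"                 # interval changes sign
    return f"significant, favors {favored}"

achievement = interpret_interval(1.2, 4.8)
attendance = interpret_interval(-3.5, -0.4, lower_is_better=True)
straddling = interpret_interval(-0.7, 2.1)
```

Here the achievement interval favors FT, the attendance interval also favors FT because fewer absences are better, and the third interval crosses zero and so indicates no significant difference.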

Explanation of Entries in the Project Data Tables

Figure 4 shows the organization of the project level evaluation
data as summarized in the project data tables. A complete project data
table contains eight groups of entries. These groups, which are
correspondingly numbered 1-8 in Figure 4, are the following:

(1) Pupil baseline/control measures
(2) Parent control measures
(3) Teacher input/control measures
(4) Pupil outcomes*
(5) Parent outcomes
(6) Teacher outcomes
(7) Project descriptors
(8) Classroom observation factor scores.

The entries in the area of Figure 4 designated as "1" provide data
on child baseline controls. The following are detailed descriptions of
individual entries:

No. of classrooms -- the number of classroom level aggregations
of pupils for which data are sufficiently
complete (90 percent) for analysis.

Average pupils/classroom -- the average number of pupils who met
the completeness of data requirement.
Average classroom size represents the
average frequency for which data on all
measures are available. Frequencies
vary considerably from measure to measure.

* Refer to Table 5 or Annex B (Table B-2) for maximum scores on baseline
and outcome measures at various grade levels and assessment points.
Quant. prescore -- the raw score (number correct) on the baseline
test (maximum varies across cohorts and grade
streams).

Cog. Process prescore -- the raw cognitive process prescore.

Reading prescore -- the raw reading prescore.

Language prescore -- the raw language prescore.

Affect prescore -- the raw affect prescore.

Age (June '71) -- the average age in months of
pupils as of June, 1971.

% Classroom male -- the percentage of boys in the classroom,
averaged across classrooms.

% Classroom Black -- the percentage of Blacks in the classroom,
averaged across classrooms.

% Preschool -- either: (a) the average percentage of pupils who
had at least some preschool (CI only) or (b) the
average number of months of preschool (CII only).

% Parents w/o HS diploma -- the percentage of parents of pupils in
the class who DO NOT have high school
diplomas, averaged across classrooms.

% Parents w skilled occup. -- the average percentage of parents of
pupils in the class who have occupations
classified as "skilled", i.e.,
professional, clerical, manager, sales,
craft, etc., but NOT service, laborer,
operative, housewife, etc.

POVERTY DEFINITIONS

                   NON-FARM FAMILY SIZE          FARM FAMILY SIZE
DOLLAR             CERT.    POSS.    NOT         CERT.    POSS.    NOT
INCOME             POOR     POOR     POOR        POOR     POOR     POOR

< 1,000            ≥ 1                           ≥ 1
1,000-2,999        ≥ 3      2        1           ≥ 4      2-3      1
3,000-4,999        ≥ 7      3-6      1-2         ≥ 8      4-7      1-3
5,000-7,499        ≥ 11     7-10     1-6         ≥ 13     8-12     1-7
7,500-9,999        ≥ 15     11-14    1-10        ≥ 17     13-16    1-12
10,000+            ≥ 19     15-18    1-14        ≥ 21     17-20    1-16

% Parents Black -- average percentage of pupil parents who are Black.

% Poverty eligible -- average percentage meeting OEO poverty
guidelines as shown in the preceding
tabulation.

% Head of household employed -- the average percentage of heads of
households who report they are
currently employed (parent interview
responses).

% Head household male -- the average percentage of heads of
households who are male.

The entries in the area of Figure 4 designated as "2" present
control data on parents. The following are detailed descriptions of
individual entries:

No. classroom units -- number of parent groups of data, aggregated
in terms of the corresponding pupils'
classrooms.

Av. parents/classroom grp -- average number of parents per unit
meeting the completeness of data
requirement. Average classroom group
size represents the average frequency
for which data on all measures are
available. Frequencies vary
considerably from measure to measure.

% w/o high school diploma -- see child definitions.

% w skilled occup. -- see child definitions.

% Pos. eval. of child lrng -- satisfaction rating of parent ("very"
to "not") with child's progress in
school; percentage of "very satisfied"
responses.

% Black -- see child definition.

% Reporting use of preschool -- percentage stating their FT/NFT
child participated in some
preschool program.

% Poverty eligible -- see child definition.

% Head household employed -- see child definition.

% Head household male -- see child definition.

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The entries in the area of Figure 4 designated as 3 display input 
data on teachers. The following are detailed descriptions of individual 
entries : 

No. of classrooms — number of classrooms for which data on teacher
    variables are sufficiently complete.

Job satisf. rating — the average scaled satisfaction response
    (3 = very satisfied, 2 = satisfied, 1 = dissatisfied) regarding
    working conditions, i.e., equipment, supplies, classroom space,
    schedule, salary, and planning time.

Book Resource scale — the number of book and library facilities
    available to the students — classroom, take home, central
    library, etc. (maximum score = 6).

Race (Black/non-Black) — Black = 1; non-Black = 0.*

Ident. w community — the extent to which the teacher is both a
    longtime member of the community and a resident of a
    neighborhood similar to that of the pupils (maximum score = 2).

No. of helpers — the minimum number of helpers (aides or volunteers)
    utilized in the class.

Able to choose assignment — the extent to which the teacher was
    able to choose her current school and classroom teaching
    assignment (2 = teacher's own choice; 1 = at request of other;
    0 = assigned, no choice — each for school and classroom
    assignment).

Trng & teacher experience — an aggregate training and experience
    variable indicating the extent of formal education, degrees
    held, specific course completions, certification, years of
    experience, grade levels taught, and tenure of the teacher
    (maximum score = 12 points).

The entries in the area of Figure 4 designated as "4" display outcome 
data on pupils. The following are detailed descriptions of individual
entries : 



* Cohort II teacher ethnicity coding is reversed: Black = 0, non-Black = 1.

Overall achievement -- the raw score sum of correct responses to
    all cognitive test battery items.

Affect — the scaled sum of seven responses to the attitude inventory
    (3 = happy, 1 = sad; min. = 7, max. = 21).

Attendance — the average number of days absent from class in the
    1970-1971 school year.

WRAT — the raw score sum of correct responses to the WRAT.

Quantitative — the raw score sum of correct responses.

Cognitive processes -- the raw score sum of correct responses.

Reading skills -- the raw score sum of correct responses.

Language arts -- the raw score sum of correct responses.

Two entries are made for the subgroups on each measure. The out- 
come entry shows the actual or unadjusted posttest averages for FT and 
NFT subgroups on each measure. The adjusted outcome entries show these 
values after regression for differences on covariables (i.e., differences
on baseline/control averages). 
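
The adjustment described above can be illustrated with a minimal sketch. It assumes a single covariable and a pooled within-group regression slope, whereas the analyses reported here adjust for several baseline variables. The function name and all numbers are invented for illustration and are not taken from the evaluation data.

```python
from statistics import mean


def adjusted_means(groups):
    """Return covariance-adjusted outcome means for each group.

    groups maps a group label (e.g. "FT", "NFT") to a list of
    (covariable, outcome) pairs, one pair per classroom.
    Hypothetical helper: one covariable, pooled within-group slope.
    """
    # Grand mean of the covariable over all classrooms.
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)

    # Pool within-group sums of products and squares to estimate the slope.
    sxy = 0.0
    sxx = 0.0
    for pairs in groups.values():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    slope = sxy / sxx

    # Shift each group's outcome mean to the value expected at the grand
    # covariable mean.
    result = {}
    for label, pairs in groups.items():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        result[label] = my - slope * (mx - grand_x)
    return result


# Invented classroom averages: (baseline score, posttest score).
example = {
    "FT": [(10, 20), (12, 24), (14, 28)],
    "NFT": [(20, 38), (22, 42), (24, 46)],
}
adjusted = adjusted_means(example)
```

In this invented example the unadjusted posttest averages favor NFT (42 against 24), but NFT also started ten points higher on the baseline measure. After adjustment the comparison reverses, which is why both the unadjusted and the adjusted entries are displayed.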

The entries in the area of Figure 4 designated as "5" present
outcome data on parents. The following are detailed descriptions of
individual entries:

Parent/child interact — the extent to which parents report they
    actively interact with their children in such activities as
    talking with their children, taking their children on trips,
    helping their children with school work, reading to them,
    accepting assistance from them, and acknowledging their
    progress in school.

Parent/school involve — the extent to which parents report they
    are actively participating in various school-related
    activities, such as classroom visits, volunteer assistance,
    parent/school meetings, external contacts with school
    personnel.

Child academic expect — the extent to which the parent reports
    satisfaction with child's progress and optimism regarding the
    child's future, both academic and nonacademic (e.g., what are
    the child's expected grades, chances of getting a good job,
    chances of going on to college?).

Sense of control -- the extent to which the parent reports a sense
    of concern and control over school procedures, educational
    reforms, and school awareness of and responsiveness to parent
    and community desires and needs.

These entries are all expressed as standardized (M = 0, SD = 1) scale
values and are displayed in unadjusted and adjusted forms. 
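
As a minimal sketch of what a standardized (M = 0, SD = 1) scale value means, the following converts raw scale scores to standard scores. It assumes the population standard deviation of the pooled sample is used; the report does not state which form was applied, and the raw scores shown are invented.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert raw scores to standard scores with mean 0 and SD 1.

    Assumption: the population standard deviation (pstdev) of the
    scores passed in is the reference; this is illustrative only.
    """
    m = mean(scores)
    sd = pstdev(scores)
    return [(s - m) / sd for s in scores]


# Invented raw scale scores for eight parents.
raw = [2, 4, 4, 4, 5, 5, 7, 9]
z = standardize(raw)
```

A standardized value of 1.0 then indicates a response one standard deviation above the pooled average, regardless of the number of items in the original scale.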

The entries in the area of Figure 4 designated as "6" present
outcome data on teachers. The following are detailed descriptions of
individual entries:

Parent-educator image -- the extent to which teachers reported they
    felt it essential to "get together with parents outside of the
    classroom" for purposes of:

    • Improving children's learning

    • Improving classroom teaching

    • Learning parent's views on teaching

    • Improving school services to parents

    • Improving school services to children

    • Improving school services to community

    • Parental understanding of school program.

Professional accept of method — the extent to which the teacher
    reports she would not prefer to adopt some teaching approach
    other than the one she is currently using.

Like the parent outcomes, teacher outcomes are expressed as standardized 
scale scores and are shown in adjusted and unadjusted forms. 

The entries in the area of Figure 4 designated as "7" describe the
project. The following are detailed descriptions of individual entries.

Region — geographic location of project.

Distance to nearest SMSA — within SMSA; 20 miles from nearest
    SMSA; 30 to 40 miles from nearest SMSA; 50-70 miles from
    nearest SMSA; or 75 to 120 miles from nearest SMSA.

Size of nearest SMSA — 1970 population of nearest SMSA for those
    communities not within a SMSA.

Percent nonwhite -- percent of community population that consists
    of minority group members.

Project size (pupils) -- number of participating pupils as projected
    in the 1970-71 project application.

No. pupils/PAC member — projected 1970-71 enrollment divided by the
    size of PAC as reported in 1970-71 membership listing.

FT per pupil expenditure -- total anticipated funds supplementing
    district maintenance of effort (sometimes includes Title I
    funding) divided by projected enrollment.
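
The two ratio entries defined above (pupils per PAC member and FT per pupil expenditure) are simple quotients. The sketch below shows the arithmetic with hypothetical function names and invented round figures; none of the numbers come from a project application.

```python
def pupils_per_pac_member(projected_enrollment, pac_size):
    """Projected enrollment divided by the number of listed PAC members."""
    return projected_enrollment / pac_size


def ft_per_pupil_expenditure(anticipated_funds, projected_enrollment):
    """Anticipated supplementary funds divided by projected enrollment."""
    return anticipated_funds / projected_enrollment


# Invented project: 500 projected pupils, 25 PAC members, $300,000 in funds.
ratio = pupils_per_pac_member(500, 25)
per_pupil = ft_per_pupil_expenditure(300_000, 500)
```

Because both entries use projected rather than actual enrollment, they describe the project as planned in its application and may differ from figures based on the pupils actually tested.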

The entries in the area of Figure 4 designated as "8" give classroom
observation factor scores. The following are detailed descriptions of
individual entries:

Self-regulatory — children working independently on activities
    not strictly academic.

Child-initiated interactions — children initiating interactions
    and receiving positive or negative feedback from adults.

Programmed academic — adults teaching small groups of children by
    highly structured question-response-reinforcement interactions.

Expressive — positive and negative affect expressed by both children
    and adults.

Child self-learning — children working alone with books or seatwork
    materials.

Section IV
RESULTS

Introduction

This section is divided into the following three parts:

(1) Evidence of results at the project level presented on a sponsor
by sponsor basis

(2) Overall results presented on a cohort basis

(3) Independent evidence of FT planned variations.

Part 1 presents the results and interpretations of individual project
outcomes, both CI and CII. Because of the quantity and complexity of
these results, several steps are followed to organize and simplify the
presentation.

First, all results are organized in terms of individual projects, as
in the interpretive guide. These project-level results are further grouped
by sponsor. The reasons for this organization of the interim results are
the following:

• The objective of this interim evaluation is not to contrast each 
approach or planned variation to others, but to assess the impact 
of each approach on pupils, parents, and teachers by comparing 
their development with the development of NFT comparison. 

• Organization of impact evidence on the basis of planned variations 
allows comprehensive evaluation of how outcomes compare with the 
goals of the specific approach and how they are related to the 
characteristics of that approach. 

The presentation of the results at the project level is organized
as follows:

(1) A comprehensive description of the planned variation or approach,
which was produced by SRI in collaboration with the sponsors,
is presented. This description of the approach presents the
sponsor's intended goals; it does not necessarily describe what
actually happened.



(2) The results of individual CI and CII projects that employed this
approach and for which interim data have been analyzed are presented.
In general, the sequence of presentation is from Cohort I
to Cohort II samples (i.e., two-year results to one-year results).
Also, first-year (Spring 1970) results of Cohort I are compared
with the corresponding second-year results (Spring 1971) when
both are available and, with Cohort II first-year results, when
they are available.

(3) A summary of results is presented for the sponsor. This summary
includes an outline of what we believe to be the salient features
of the planned variation (objectives, curriculum emphases,
parent component). The results of sponsor-level analyses are
included only if two or more projects at a particular cohort and
grade stream are present. Separate sponsor-level analyses for
individual projects are redundant and, therefore, are not included
in the summary.

Part 2 of this section presents summary tables and discussions of
evidence of overall effects at the cohort level. In this presentation, 
we attempt to show how results are related to the quality of the matching 
of comparison groups with FT groups. These interpretations and evaluations 
are informal and descriptive. 

Part 3 documents the extent to which FT planned variations currently 
exist. These evaluations are based directly on analysis of classroom 
observation variables and factors across projects and cohorts.






Part 1 



RESULTS FOR EACH PROJECT 
BY SPONSOR 



RESPONSIVE EDUCATIONAL PROGRAM 
Far West Laboratory for Educational Research and Development 



Sponsor's Intended Approach

Learning activities that are self-rewarding (autotelic) and an environment
structured to be responsive to the individual child's needs, culture,
and interests are the main principles in this model. The autotelic
principle states that the best way for a child to learn is for him to be in
an environment in which he can try things out, risk, guess, ask questions,
and make discoveries without serious psychological consequences. Autotelic
activities include learning activities that help the child develop a skill,
learn a concept, or acquire an attitude that can be usefully applied in
some other endeavor.

This sponsor believes that rewards are intrinsic within an activity 
and that the child gets feedback from physical materials as well as human
interactions. Thus, he need not depend solely on the authority of the 
teacher for rewards, punishments, or feedback. The child becomes self- 
directed and develops inner controls. 

The goals of the model are for the child to develop his intellectual
abilities and to develop a healthy self-concept. A healthy self-concept
allows the child to accept himself and his culture, to make realistic
estimates of his own abilities and limitations, and to have confidence
in his own capacity to succeed. Such a child is willing to take risks,
learns from his mistakes, and feels safe in expressing his feelings.
He learns to apply all his resources--emotional, physical, and
intellectual--to the process of solving problems within his environment.

In the Responsive Model classroom the child is free to explore within 
a carefully controlled environment containing learning centers and a 
variety of games and activities. There is freedom to choose activities 
within already established limits. What he chooses to do is more likely 
to become important to him, to stimulate affective involvement, and to
pose real problems. The child searches for solutions to problems in his 
own way using a variety of resources, both physical and human. The teachers 
guide his discovery of solutions. The child finds out if his solutions 
work. Solutions he discovers often fit together and lead to other dis- 
coveries. The child's reward is what he gains from the entire experience. 

Learning sequences have been developed for the model, but each child
may work at his own pace. There are no constraints to master given lesson
content by a given time. It is assumed in the model that no single theory
of learning can account for all the ways in which children learn. What
is considered essential is that a variety of educational alternatives be 
available to build on whatever background, cultural influence, or life 
style the child brings to school. 

The sponsor of this model trains a person from the local community 
to act as Program Advisor. The Program Advisor conducts inservice train- 
ing for all staff and parent groups and is responsible for carrying the 
model's program into the classroom. One aspect of the training includes 
developing career-directed jobs for parents as teacher assistants, typing 
booth attendants, and the like. The training program is the first concern 
in evaluating the model overall. An attempt is made to determine how
effective the training program is in producing the changes in teacher behavior
required to implement the model and whether the changed behavior indeed
affects the growth of children toward the self-concept and intellectual
objectives of the program. 

Since the approach taken by the Responsive Model places equal respon- 
sibility for the child's education on the home, particularly heavy emphasis 
is placed on parent involvement. Parents are offered training during which 
they are familiarized with the program and trained to pursue its objectives 
in the home. A game and toy library is available for parent use, and it
includes filmstrips and audio tapes that demonstrate how the toys and games
should be used. The sponsor also offers a course to teacher-librarians 
so they can further assist parents in the application of program materials. 

In addition to the parents trained specifically for employment in the 
project, parents in general are invited to participate in classroom activ- 
ity on a volunteer basis. This gives them the opportunity to become aware 
of the kinds of adult-child interactions that contribute to the child's
success in school and to become familiar with the principles and the
activities of the program. The purpose of the carefully planned parent
involvement demonstrated by this model is to train parents for the
leadership and policy-making roles the sponsor feels they should assume in the
education of their children. 

Individual Project Results 

Seven samples from three different projects sponsored by Far West
Laboratory for Educational Research and Development (FW) were included 
in the analysis of the interim effects. The distribution of these evalua- 
tion samples in terms of cohort, outcome, and project is as follows: 

Cohort    First-year Effects       Second-year Effects

IK        (project b)              (projects a, b, & c)
IIK       (projects a, b, & c)


Project FW(c) 

Project FW(c) is a relatively small, nearly all white project, 
located some distance from a relatively small (95,000) New England city.
The anticipated per pupil Follow Through expenditure in this project was
$634, and there was one PAC member, on the average, for each 13.7 pupils.
Two separate samples were analyzed for this project, a Cohort I-K
second-year sample and a Cohort II-K, one-year sample.

The results of the Cohort I, two-year analysis are presented in Table 
15. This analysis was based on a sample of four Follow Through and three 
non-Follow Through classrooms. 

These sets of classrooms were only moderately similar. NFT class- 
rooms averaged consistently higher than FT classrooms on the baseline 
measures and were more than 10 points above the FT classrooms on the 
reading prescore. As usual, the Follow Through classrooms had a higher 
proportion of preschool experience but appeared to have families quite 
different from those of the non-Follow Through pupils. Specifically, 
FT parents tended to have substantially less education, were more likely 
to be unemployed, were in lower occupational categories, were poverty 
eligible, and had fewer male heads of household than NFT families. 
Teachers, on the other hand, appeared somewhat better matched on the input 
variables. For example, the FT and NFT samples differ only in the number 
of helpers and the relative experience of teachers.

The analysis of covariance on pupil outcomes showed a significant
difference in favor of Follow Through on the quantitative skills measure.
Other test measures failed to reveal significance, but generally showed
differences in favor of the Follow Through sample. The parent outcomes
and teacher outcomes presented in Table 15 indicate that no other results
reach significance for this sample.

The Cohort II first-year results for this project are summarized in 
Table 16. The baseline averages for these three FT and two NFT classrooms
indicate that the comparison group match is only slightly better than that of
the Cohort I sample in Project FW(c). The preschool averages for the two
groups were approximately the same except for "affect," in which FT was sub- 
stantially higher than NFT. However, in the NFT classrooms there were con- 
siderably more boys than girls, whereas in the FT classroom there were about 
the same number of boys and girls. In addition, the FT parents were less
likely to have a high school education or be employed in a skilled
occupation. The head of a FT household was less apt to be employed and less
apt to be male than the head of an NFT household. Nearly 20 percent of
the FT families were poverty eligible under the OEO criteria, whereas 
none of the NFT families were. Teachers in the two samples were moderately 
comparable. FT teachers appeared more satisfied with their jobs, had 
more book resources, were better integrated in the communities in which 
they taught, and had more helpers than NFT teachers. However, FT teachers 
reported less freedom in choosing assignments and were less experienced 
than NFT teachers. 

Analysis of pupil outcome measures in terms of these background differ- 
ences revealed a significant FT superiority only in measures of cognitive 
processes. Outcomes of parent and teacher measures failed to show signifi- 
cant differences between FT and NFT groups. 

In summary, the only significant differences between FT and NFT groups
in Project FW(c) were the superiority in quantitative skills of the FT
group in the Cohort I, second-year sample and the superiority in cognitive
processes of the FT group in the Cohort II, first-year sample.

Project FW(b)

Project FW(b) is a moderately large project of 720 pupils, primarily 
white, located within a city of 138,000 in the west north central region. 
The anticipated per pupil expenditure in this project was $606, and, on 
the average, there were 19 pupils per PAC member. 

Analyses were performed on three sets of data for Project FW(b):
a Cohort I-K, second-year effects analysis; a Cohort I-K, first-year effects
subset; and a Cohort II-K, first-year effects analysis. Both the Cohort
I and the Cohort II classrooms were included in the classroom observation
sample.

The results of the two-year effects analysis for Project FW(b) are 
summarized in Table 17. The six FT and the six NFT classrooms participating 
in this analysis were fairly well matched on pupil and parent variables. 
Substantial differences between FT and NFT groups were noted only in pre- 
school experience and parents' evaluation of the pupils' academic progress. 
Parents in both samples were moderately well educated and were employed at 
skilled occupations; less than half the families are poverty eligible, 
and more than three quarters of the family heads are males or employed 
or both. The teachers were also relatively well matched, differing notice- 
ably only in number of helpers and relative training and experience. 






The analysis of pupil measures failed to display any significant 
differences between the FT and NFT samples, although means for the FT 
sample were slightly higher in each instance, an outcome that may indicate 
that FT had a small effect. 



FT parents showed significantly greater involvement in school activities
than NFT parents, a result that indicates that the program did have
some impact on parents. However, no other outcome for either parents or
teachers reached significance.



The classroom process data, however, do indicate that the sponsor's
program is being implemented. The above average scores on the self-regulatory
and child self-learning factors and the below average score on the
programmed/academic factor are consistent with sponsor goals. Only FT
classrooms were included in the sample, so contrasts between FT and NFT cannot
be drawn. 



The first-year effects for the Project FW(b), Cohort I sample are
summarized in Table 18. The results indicate that none of the test score
differences reach significance. A comparison of the net scores (adjusted 
outcomes) shown in Tables 17 and 18 gives some evidence of a cumulative 
effect in that the second-year differences are more positive than those 
of the first year. This interpretation receives further support from 
data presented in Table 19, which displays the background properties and 
first-year effects for a Cohort II-K sample in Project FW(b). The
baseline variables for the pupils indicate that the four FT and two NFT 
classrooms are reasonably well matched on pretests and family character- 
istics. Teachers, on the other hand, appear somewhat different in that 
NFT teachers tended to have more book resources and freedom to choose 
assignments, whereas FT teachers had more classroom helpers and, in 
general, higher levels of training and experience. 

Analysis of the FT/NFT differences on each outcome measure revealed 
that only for parent/child interactions were the differences significant. 
More interactions were reported by FT parents, a result that is consistent 
with the model's goals, one of which is to encourage parents to participate
directly with their own children inside and outside the classroom. None of
the pupil or teacher measures reached significance. The results for Cohort
II, first-year pupils and those for Cohort I, first-year pupils tend to
favor NFT. This evidence may indicate that the model implemented in 
Project FW(b) has an initial disruptive effect. This finding coupled 
with some more positive Cohort I results obtained at the end of the 
second year suggests that the project effects measured by these outcome 
variables are not likely to be immediate or positive. This conclusion 
is supported to some extent by the process data for the two separate 
samples, which interestingly show the same general pattern of the factor
scores. However, Cohort II is much higher on the self-regulatory factor
than Cohort I, while Cohort I is much higher on the self-learning factor.
Both factors are representative of Far West's goals.

Project FW(a) 

Sponsor FW evaluation data were also collected and analyzed for Cohort 
I and Cohort II in Project FW(a). Project FW(a) is a relatively large
project, located within a very large urban area (3 million people) in 
the Pacific region. The anticipated per pupil expenditure was $653, and 
there were slightly more than 21 pupils per PAC member in an average 
classroom. 

Table 20 summarizes the results of the analysis. The Cohort I sample 
for Project FW(a) consisted of nine FT and four NFT classes. NFT classes
were fairly well matched to FT classes on baseline test scores but were
clearly different in ethnic composition, FT classrooms being predominantly
Black and NFT classrooms predominantly non-Black. However, except for 
this difference in racial make-up and the usual greater preschool participa- 
tion of FT pupils, the two samples appear quite comparable. Teacher 
analyses were not performed because of insufficient data. 


The only pupil variable that differed significantly between the two 
groups was attendance, which was higher for the NFT sample. The only
parent variable that differed significantly was sense of control; the NFT
parents reported a greater sense of control than FT parents (the 95 percent
confidence interval favored NFT by .34 to 1.90 scale units). The classroom
process characteristics summarized in Table 20 suggest a pattern somewhat
different from the patterns of other samples of this model. These
classrooms appear not only quite high on the self-regulatory factor, which is
characteristic of the model, but also above average on the programmed/ 
academic factor. The programmed/academic factor represents an educational 
format not usually associated with this sponsor's goals. 

The remaining analysis of Project FW(a) is of Cohort II first-year 
data, which are summarized in Table 21. The baseline values for this 
sample show that six FT and two NFT classrooms are well matched. However, 
a problem with the preschool data from the pupil rosters precluded inclusion 
of preschool experience for this sample. This omission accounts for the 
discrepancy between the average preschool values reported for the pupils 
and those reported by the parents. Again, teacher data were insufficient 
to enable analysis for this sample. 

First-year impact results show that the NFT sample significantly
exceeds the FT classrooms on the cognitive measure and on daily attendance.
No other measures, either pupil or parent, show significant FT/NFT differ- 
ences. 

The process scores for this sample indicate a pattern consistent with
the sponsor's model. Specifically, the FT classrooms are above average
on the self-regulatory factor and below average on the programmed/academic
factor. Like Cohort I, this sample is especially high on the expressive
factor. The explanation of the child outcomes does not appear to lie in
a lack of implementation of the model.



Summary 

The salient features of the Sponsor FW model can be outlined as 
follows: 

Focus and Objectives — emphasizes long range program objectives

    Child

        Cognitive

            Develop problem solving ability

        Affective

            Develop self-direction
            Increase ability to take risks, learn from mistakes, and feel
                safe in expressing feelings
            Develop a healthy self-concept

    Parent

        Develop parent's ability to teach his children

Curricular Approach

    Teacher's role that of facilitator
    Intrinsic reinforcement from activities
    Individual child free to choose among self-rewarding activities
        within structured environment
    Wide variety of activities available

Type of Parent Involvement

    Heavy emphasis on training parents for employment in projects
        as teacher assistants

Two separate impact analyses were conducted at the sponsor level.
The first analysis was conducted on the Cohort I second-year data, the
second on Cohort II first-year data. These results are summarized in
Tables 22 and 23, respectively. The results summarized in these tables
indicate that interim evidence at the sponsor level is inconclusive
although not particularly favorable. In particular, no pupil level
results are indicated in either the Cohort I or Cohort II analyses.
However, the model shows significant impacts on parents in each case. In
Cohort I, the parent/school interaction variable is significantly higher
for FT parents, and in Cohort II, the parent/child interaction variable
is significantly higher in the FT group. Finally, although project level
analyses were not possible for teacher data, aggregation to sponsor level
did enable an analysis of the Cohort I data. However, the results, which
are displayed in Table 22, show that the effects failed to reach
significance.



In total, the interim statistical evidence on the effects of the Far 
West Laboratory's program is not particularly favorable. It may be that 
the FT battery, which measures more or less traditional academic be- 
haviors, is not particularly well suited to many of the objectives 
stressed by this sponsor's model, particularly at the kindergarten and 
first grade level. Nevertheless, given the emphasis this model places 
on parental involvement, it is difficult to understand the lack of
consistent evidence of positive impact on the parental outcome measures.
Quite likely a program of this complexity (its primary focus is that of
changing process) requires more time to reveal its true impact.


TUCSON EARLY EDUCATION MODEL (TEEM)
University of Arizona

Sponsor's Intended Approach 

Participation in contemporary society requires skills and abilities 
missing in the behavioral repertoires of many individuals because their
background does not provide an adequate foundation. The TEEM model
attempts to solve this problem by providing children with educational
experiences appropriate to developing such skills and abilities — beginning
with the behavioral characteristics and level of development with which the
child enters school and working from there. The model calls on teachers 
to individualize their teaching and emphasizes persistent adult-child in- 
teraction on a one-to-one basis. To meet the needs and learning rates 
of individual children, the model provides a great variety of behavioral 
options, including both self-selected and structured activities. 

The curriculum for the model focuses on four general areas of develop- 
ment: language competence, development of an intellectual base, develop- 
ment of a motivational base, and societal arts and skills. An intellectual 
base includes skills assumed to be necessary to the process of learning 
(e.g., ability to attend, recall, organize behavior toward goals, and 
evaluate alternatives). A motivational base includes attitudes and
behavior related to productive involvement, such as liking school and
learning, task persistence, and expectation of success. Societal arts and
skill acquisition include reading, writing, and math skills, combined
with social skills of cooperation, planning, and the like. 

In this model a skill is always taught in a functional setting, and 
concepts are illustrated by a variety of examples across content areas 
both within and outside the classroom. Field trips, walks, and visits 
to the children's homes help the child generalize new skills to his own 
environment. The technique of simultaneously attending to developing 
language, intellectual, motivational, and societal skills in a meaningful
setting is defined in the model as "orchestration."

The TEEM classroom is organized into behavioral settings and interest 
centers for small groups to encourage interactions among the child, his 
environment, and others. Pupil groups are purposely heterogeneous so that 
children of different ability levels will learn from peer models and work 
independently with available materials. Imitation, a formal part of 
classroom practice, is viewed as an especially important process in
language acquisition. Social reinforcement techniques, such as praise, 
attention, and affection, are liberally applied, and materials are chosen 
and arranged for their reinforcing value. Every effort is made to ensure 
that the child will come to regard school as significant and rewarding. 

In the open-ended context of this model, lessons and learning
experience are given definite structure and direction through careful
planning by the staff. Adults working in the classroom are trained to use
the experiential background of pupils to further instructional objectives, 
and the home and the neighborhood are treated as instructional resources. 

The delivery system for the TEEM model includes programs and services 
developed to provide continuous input, demonstration, and evaluation to 
the community, the classroom instructional staff, and parent liaison
personnel. Field representatives visit sites to provide guidance and
communicate questions and problems back to the TEEM center. School
psychologists serve as consultants to teach project staff to apply
psychological techniques in defining and solving educational problems.
Evaluation services include a new program that clearly sets out objectives of
the program and ways for the community to evaluate how well they are met. 

The model establishes positive and frequent contact between schools 
and parents to acquaint parents with the instructional program and to
influence them to participate in school-related activities, work with the
Policy Advisory Committee, serve as classroom volunteers, and train for 
new careers. An attempt is made to provide parents desiring to have a 
more direct influence on educational policy with increased knowledge 
about the school system and the political influences that play a role 
in policy making. 

Individual Project Results 

Seven samples from four different projects sponsored by the University 
of Arizona (UA) were included in the analysis of interim effects. The 
distribution of these evaluation samples in terms of cohort, outcome, and 
project is as follows:

Cohort    First-Year Effects    Second-Year Effects

IK        (project d)           (projects a & d)
IEF       (project c)           (projects b & c)
IIEF      (project c)

Project UA(d)



Table 24 presents the analysis data and results for Project UA(d). 
This project is located in a non-urban area in the mid-Atlantic region. 
Nine FT and five NFT classrooms were included in the analyses. These 
FT-NFT classrooms were quite dissimilar; FT classes had a more even 
distribution of boys and girls, a higher proportion of Blacks, a larger 
proportion of pupils with preschool experience, more poverty-eligible
families, and fewer employed parents and heads of households than NFT
classes. Moreover, the FT classroom averages were below the NFT classroom
averages on all cognitive baseline tests. The baseline data on
parents differ similarly.

Covariable data for the teacher analyses (nine FT classes, three
NFT) show that FT teachers reported more book resources, more helpers,
and more years of experience than the NFT teachers. NFT teachers, however,
appeared more closely tied with the school community.

Analysis of covariance on outcomes at the child level failed to
reveal any statistically significant effects. Although the adjustments
for covariable bias were pronounced (compare unadjusted and adjusted
results), all confidence intervals cross zero (change signs), indicating
nonsignificance. This project showed the least progress of any on
language and reading measures, on both unadjusted and adjusted pre-post
comparisons.
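
The distinction between unadjusted and adjusted results can be illustrated with a minimal sketch of a one-covariate adjustment. The classroom values below are hypothetical and are not taken from Table 24; the single baseline covariate and the function names are assumptions made only for illustration, whereas the analyses reported here used several covariables.

```python
from statistics import mean


def pooled_within_slope(groups):
    """Pooled within-group slope of outcome on covariate (hypothetical helper)."""
    sxy = 0.0
    sxx = 0.0
    for xs, ys in groups:
        mx, my = mean(xs), mean(ys)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def adjusted_difference(ft, nft):
    """Return (unadjusted, adjusted) FT-minus-NFT outcome differences.

    The adjusted difference removes the part of the raw difference that is
    predicted by the groups' difference on the baseline covariate.
    """
    (x_ft, y_ft), (x_nft, y_nft) = ft, nft
    slope = pooled_within_slope([ft, nft])
    unadjusted = mean(y_ft) - mean(y_nft)
    adjusted = unadjusted - slope * (mean(x_ft) - mean(x_nft))
    return unadjusted, adjusted


# Hypothetical classroom averages: (baseline scores, outcome scores).
# FT classrooms start below NFT classrooms, as in the project described above.
ft = ([10.0, 12.0, 14.0], [20.0, 24.0, 28.0])
nft = ([14.0, 16.0, 18.0], [27.0, 31.0, 35.0])

unadjusted, adjusted = adjusted_difference(ft, nft)
print(f"unadjusted FT-NFT difference: {unadjusted:+.1f}")
print(f"adjusted FT-NFT difference:   {adjusted:+.1f}")
```

In this made-up case the raw difference favors NFT, while the adjusted difference is slightly positive once the lower FT baseline is taken into account; whether such a difference is significant still depends on its confidence interval.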

Of the four outcome variables at the parent level, only the parent/ 
school interaction measure shows significance. The 95 percent confidence 
interval for the mean difference (adjusted) between FT and NFT on the 
parent/school interaction scale is between .05 and about 1.3 units in favor
of FT parents. This result is interesting since parental involvement was
below average in this project (one PAC member per 20.4 pupils).

Teacher results failed to reach significance. Since the project 
was not included in the classroom observation sample, little beyond the 
differences noted on input measures can be said at the teacher level. 
The impact of FT in this project appears negligible at the end of the 
second year of implementation. 

This project was among the CI subset for which first-year impact 
data were also available and analyzed. The results of this analysis
are summarized in Table 25. These results show that none of the pupil
outcomes reached significance. However, a decline between first-year 
and second-year results is evident. Whereas FT students showed a con- 
sistent, though nonsignificant, advantage at the end of the first year, 
the opposite was true at the end of the second year. This trend suggests 
that the program is not having the desired impact on pupil performance. 

Project UA(a) 

The evaluation data for Project UA(a), also a Cohort I-K project,
are summarized in Table 26. This project is located in a large city 
(population 1.5 million) in the south Atlantic region. The project is 
predominantly black, made up of underemployed, very poor families with 
few years of formal education. The background characteristics of the 
five FT and the seven NFT classrooms are highly comparable. Baseline 
test averages of the two groups were nearly identical and, except that 
more of the FT pupils had had preschool experience, the groups appear 
quite similar on all child variables. 

Parents and teachers seem reasonably well matched on most variables. 
One exception is that a higher proportion of FT parents favorably evaluated
their child's academic progress. FT teachers reported having more books
and more classroom helpers, more freedom to choose teaching assignment, 
and more teaching experience than NFT teachers. 

Outcome analyses for pupil measures fail to reveal significant program
impacts. Moreover, the relative differences in the pretest and posttest
scores, both adjusted and unadjusted, are nearly identical on every
outcome measure. Thus, there is virtually no evidence of differential
effect due to the FT program. Similarly, analysis of parent variables
failed to display any significant FT/NFT differences, though all measured
differences favored the FT group.

The one significant outcome for this project was the impact of the 
program on teacher approval of classroom procedures similar to those used 
in the FT classrooms. FT teachers showed more approval than NFT teachers
at the 95 percent confidence level. Adequate interpretation of this 
result would require detailed process descriptions which are unavailable 
since classroom observations were not made in this project. 






Project UA(b) 



The results for Project UA(b), an entering first grade project, are 
displayed in Table 27. This very large project (1050 pupils) is located
in a major urban area in a west-south central state. Eighty percent of the 
families participating in the FT classes were Black. Evidence regarding 
the similarity of the comparison group is mixed. NFT classes had a higher 
percentage of boys than FT classes, whereas FT parents tended to have less 
education, lower occupational levels, less current employment, lower in- 
comes, and fewer male heads of household than NFT parents. The FT group 
had only slightly higher preschool participation rates than the NFT group. 
FT teachers were slightly less satisfied with their jobs, more likely to 
be Black, less likely to be resident within the school community, and more 
likely to have an assistant or classroom helper than NFT teachers. 

No pupil outcome measures showed significant program impact. However,
all FT/NFT adjusted differences indicated a modest positive FT program
increment. (Note that the attendance measure is actually an absence rate;
thus a lower value is favorable.) Since sponsor-level averages were used
to estimate classroom data, analysis of effects actually represents the
regressed estimate of the model's impact on a sample displaying these
population characteristics. Thus, it is estimated that this FT program, which
was anticipated to cost an average of $996 per pupil, would produce small, 
statistically nonsignificant gains over a comparable sample of pupils with- 
out the program. The strongest gain would be in reading skills. 



Parent impact data also failed to reach significance for this project. 
The one difference that approached significance (95 percent confidence 
interval = -.09 to 1.13 units) was for parent/school interactions. Also, 
the pupil/PAC ratio of 15.9 suggests that, on the average, slightly more 
than one parent per classroom participated in PAC.

FT teachers responded significantly more favorably to FT-like procedures
(i.e., gave a positive evaluation to classroom practices) than
NFT teachers. The confidence interval indicates a .95 probability that 
this true effect is somewhere between .07 and 1.37 scale units. Since 
classroom observation data were available for several classrooms in this 
project, averages on Factor Scores are presented. These averages show 
FT as most different from NFT on the expressive and the self-regulatory
factors. These factors best describe the Arizona model and suggest im- 
plementation is taking place according to sponsor goals. While FT classes 



Because of a redefinition of K and EF distinctions, this project sample 
had been administered an inappropriate level of the test battery,
resulting in exclusion of the data from analysis.






are higher than NFT classes on the programmed academic dimension, they 
appear to have little more than the average amount of programmed academic
activity. This may, in part, account for the lack of strong results in 
the traditional academic achievement variables and further suggests that 
those tests do not adequately evaluate this model. 
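
The rule applied throughout these project summaries, that a result is treated as significant only when its 95 percent confidence interval does not cross zero, can be sketched as follows. The two intervals are those quoted above for Project UA(b); the function name `excludes_zero` is a hypothetical helper introduced for illustration and is not part of the original analysis.

```python
def excludes_zero(lower: float, upper: float) -> bool:
    """Return True when a confidence interval lies entirely on one side of zero."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return lower > 0 or upper < 0


# 95 percent confidence intervals as reported in the text for Project UA(b).
intervals = {
    "parent/school interaction": (-0.09, 1.13),
    "teacher approval of FT-like procedures": (0.07, 1.37),
}

for name, (lower, upper) in intervals.items():
    verdict = "significant" if excludes_zero(lower, upper) else "not significant"
    print(f"{name}: {lower} to {upper} -> {verdict}")
```

The parent/school interval includes zero and so only approaches significance, whereas the teacher interval lies wholly above zero.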

Project UA(c)

The remaining University of Arizona project included in this interim 
evaluation is UA(c). This project, for entering first grade pupils, has 
data for analysis of first- and second-year effects in Cohort I and for 
first-year effects in Cohort II. 

Project UA(c) is characterized as predominantly white, moderate in
size, and located within 20 miles of an SMSA of 120,000 residents in the
south Atlantic region. The anticipated per pupil expenditure of $910 is
slightly above average, and the pupil/PAC ratio of 7.9 is well above
average. Classroom observation data were collected for classrooms in
both the CI and CII samples.

The results of the analysis of CI-EF two-year program effects are
presented in Table 28. These results are based on four FT and four NFT
classrooms. The quality of the FT/NFT match on this project is considered
poor. FT classes were below NFT on all cognitive baseline measures. In
addition, the FT classes comprised higher proportions of black pupils and
greater percentages of pupils with preschool experience. But with the
exception of the ethnic and preschool variables, the FT and NFT families
appeared relatively comparable. That is, parents were nearly proportionally
equivalent on education (low), skilled occupations (low), employment
(relatively high), impoverishment (relatively low), and male heads of
household (high). FT parents also tended to respond more favorably to
the child's academic progress than did NFT parents. 

FT teachers and NFT teachers were quite dissimilar. FT teachers 
reported over twice the resources (books and helpers) as NFT, but NFT 
teachers were more integral to the communities, more experienced, and 
reported more flexibility in choosing assignments than FT teachers. 

Analysis of outcomes failed to reveal any significant program im- 
pacts in any of the pupil outcome variables. Moreover, relatively large 
deficits in achievement (reading, in particular) were evident and
approach significance. Parent impact analyses show a significant difference
on sense of control; scores of FT parents are from .2 to 1.55 scale units 
above scores of NFT parents at the 95 percent confidence level. Teacher 
outcome differences fail even to approach significance.
However, classroom observation factors indicate that the sponsor's
model is being implemented in part. As in Project UA(b), we find this
project considerably above average on the self-regulatory factor. In addition,
the child self-learning factor score appears well above average. This
factor also reflects sponsor goals. The Arizona model encourages children
to become more self-sufficient, evaluate problems, and express themselves.
Again it seems that standard achievement tests do not adequately evaluate
interim changes in children in these models. 

Comparison of these second-year results with CI-EF first-year data
(Table 29) for the same pupils shows that, if anything, performance
decrements increased from 1970 to 1971. For all achievement measures except
language skills, the two-year trend favors the NFT pupils. Also, the
one-year difference in affect (95 percent confidence interval = .8 to 5.8 units)
did not recur in the two-year data.

The Cohort II, one-year sample (Table 30) appears to be well matched
with the NFT comparison group on all variables except baseline test scores,
where FT consistently averaged above NFT. An inspection of the table
reveals that there were no FT/NFT differences on pupil outcome measures,
except for the significantly better affect scores noted for the FT classes.

Neither parent nor teacher results reach significance. This lack of 
significance may be related to somewhat limited implementation, as shown 
in classroom observation data. The profile of factor scores for Cohort II
FT classrooms is similar to that for the NFT comparison group, and scores
on factors consistent with the sponsor's model (in particular,
self-regulatory and self-learning) are generally below scores of other FT
classrooms implementing the same model.

Summary 

Separate summary analyses on University of Arizona data are reported 
only for Cohort I-K and Cohort I-EF two-year results, for each of which
two or more project samples were included. Sponsor summaries based on
single projects are not repeated in this section.

The salient features of this FT approach are summarized below:






Focus & Objectives (emphasizes intermediate objectives)

    Child

        Cognitive

            Develop language competence
            Develop skills underlying all academic performance,
            such as the ability to:
                Attend
                Recall
                Organize behavior toward goals
                Evaluate alternatives
            Develop reading, writing, and math skills

        Affective

            Develop positive attitude toward school and learning
            Increase expectation of success
            Develop social skills of cooperation and planning

Curricular Approach

    Teacher's role that of director
    Reinforcement primarily from teacher
    Emphasizes persistent adult-child interaction on 1-to-1 basis
    in small heterogeneous groups
    Provides variety of behavioral settings, including both
    self-selected and structured activities

Type of Parent Involvement

    Inform parents about program
    Encourage parents to work in classroom as volunteers
    Encourage parents to work with PAC

Results of analyses of the second-year, Kindergarten entrance samples
are presented in Table 31 and second-year first grade entrance samples in 
Table 32. These results show that the University of Arizona model has not 
produced identifiable impact on pupil outcomes after two years. Similar 
results were noted for first-year data (both K and EF, CI and CII) except 
that FT pupils seemed to show greater positive affect following one year 
in the program. It thus appears that the model has not attained many of
its cognitive objectives. However, it has met with partial success in 
attaining noncognitive objectives. 

Analysis of parent outcomes presents a somewhat more favorable
evaluation of the impact of this model. In the Kindergarten projects,
significant effects occur on both the parent/school and academic expectation
measures. In the entering first-grade samples, results of parent
outcome analyses show that FT parents are significantly more involved and
have a stronger sense of control over educational activities than NFT
parents. Also, this sponsor-level analysis (entering first grade) shows
FT teachers as significantly more approving of their methods than NFT
teachers.

Apparently then, this model has been reasonably well implemented in 
at least some projects, is having its intended impact in generating parent 
involvement in schooling, and is producing occasional evidence of other
desirable impacts, such as teacher approval and parental confidence. At 
this interim point, it appears lacking primarily in strong evidence of 
positive impacts on the child.

BANK STREET COLLEGE OF EDUCATION APPROACH

Bank Street College



Sponsor's Intended Approach 



Basic to the Bank Street approach is a rational, democratic life 
situation in the classroom. The child participates actively in his own 
learning and the adults support his autonomy while extending his world 
and sensitizing him to the meanings of his experiences. The teaching 
is diagnostic with individualized follow-up. There is constant
restructuring of the learning environment to adapt it to the special needs and
emerging interests of the children, particularly their need for a positive
sense of themselves. 

In this model academic skills are acquired within a broad context 
of planned activities that provide appropriate ways of expressing and 
organizing children's interests in the themes of home and school, and 
gradually extend these interests to the larger community. The classroom 
is organized into work areas filled with stimulating materials that allow 
a wide variety of motor and sensory experiences, as well as opportunities 
for independent investigation in cognitive areas and for interpreting ex- 
perience through creative media such as dramatic play, music, and art. The 
cognitive areas of primary concern are the capacity to probe, to reason, 
and to solve problems. Teachers and paraprofessionals working as a team
surround the children with language that they learn as a useful, pleasurable 
tool. Math, too, is highly functional and pervades the curriculum. The 
focus is on tasks that are satisfying in terms of the child's own goals 
and productive for his cognitive and affective development. 

Bank Street supports parent involvement in each community by
providing materials interpreting the program and special consultants, as
well as by joint planning for home-school interaction. Parents
participate in the classroom, in social and community activities related to
the school, and as members of the local Policy Advisory Committee.
Parents may receive career development training with either graduate or
undergraduate credit. Parents and teachers pool their understanding of
each child's interests, strengths, and needs as they plan his educational
experiences in and out of school.

Staff development is an ever-evolving process for administrators,
teachers, paraprofessionals, and local supportive and sponsor staff. It
is conducted both on site and at the College. Programs are geared to
the specific needs of each project and are guided by a sponsor field
representative familiar with the history and dynamics of a given community
in cooperation with local staff. Self-analysis is stressed in both the
teaching and administrative areas. Bank Street's 50 years of
experimentation as a multidisciplinary education center has demonstrated that a
flexible, child-oriented program requires more, not less, planning and
study. Staff development aims at providing a repertoire of teaching
strategies from which to choose on the basis of the adult's increased
understanding of individual children.

In moving from the broad conceptual framework to the specifics of
implementation, Bank Street supplies diagnostic tools for assessing child
behavior, child-adult interaction, the physical and social milieu of the
classroom, and the totality of model implementation. These instruments
are used by trained observers and in self-analysis to increase model
effectiveness and stimulate joint planning of changes needed in the classroom
and in teaching behavior, community relations, parent involvement,
and administrative practices. 



In addition to continuing services on site, Bank Street develops 
slides, films, video tapes, and other materials for adult education. 
These supplement the materials developed for use in the classroom, such 
as the Bank Street Basal Readers and Language Stimulation Materials. 
Field representatives, resource persons, program analysts, and materials 
specialists meet weekly with the Director of the Bank Street program to
share experiences, continue conceptual development of the sponsor's role,
and plan institutes and workshops differentiated on the basis of the
requirements of specific communities and participants.

Individual Project Results

Eight samples from five different projects of Bank Street College (BC)
were included in the analysis of interim effects. The distribution of
these evaluation samples in terms of cohort, outcome, and project is as 
follows:

Cohort    First-Year Effects    Second-Year Effects

IK                              (projects a, b, c, and e)
IEF       (project d)           (project d)
IIK       (project c)
IIEF      (project d)

Project BC(b)

Table 33 presents the analysis data for Project BC(b), a Cohort I-K,
two-year effects sample. This project is located in a mixed ethnic
community (25 percent Black) within a large eastern megalopolis (population
SMSA = 16,207,000). It was moderately small (500 pupils) with an
anticipated per pupil expenditure of $808 and about one PAC member per 20 pupils.

On the basis of the control variables, the seven FT and six NFT
classrooms appear reasonably well matched. They are nearly equivalent on
baseline measures, and the FT families appear only slightly more disadvantaged
than NFT. Also, teachers in the two samples appear comparable except on 
experience, aid, and autonomy. 

Outcome analyses show NFT significantly above FT on the affect
measure (95 percent confidence interval = 1.1 to 4.1 units), WRAT score
(95 percent confidence interval = .6 to 16.2 points), and reading skills
(95 percent confidence interval = .5 to 14.7 points). Other measures also
favor NFT but do not reach significance. Differences in parent outcomes 
were not significant. NFT teachers scored significantly higher on parent 
image than did FT teachers. 

The net consensus of evidence for Project BC(b) is unfavorable to the 
model. On all measures, NFT groups scored either significantly better or 
slightly better than FT groups. However, both FT and NFT averages in 
general appear higher than other Bank Street Cohort I-K projects in this 
interim sample. No classroom observation data are available to provide 
clues regarding whether or not the model was well implemented. Since the 
FT and NFT samples appear reasonably well matched and since some FT- 
favoring covariable adjusting does occur, these outcomes must be considered 
as reflecting poorly on the impact of FT as implemented in this project. 

Project BC(e) 

Table 34 presents the analysis data for Project BC(e), which is also 
a CI-K, two-year effects sample. Located within a moderately small SMSA 
(371,000) in the south Atlantic region, the project is within a racially 
mixed community (44 percent nonwhite), is moderately large, and had a
rather high anticipated per pupil expenditure of $1140. On the average,
there was one PAC member per 16 pupils.


The six FT and six NFT classes are moderately well matched in terms 
of pupil baseline test scores and classroom composition, but the groups 
differ widely on preschool experience (FT = 94 percent, NFT = 0 percent).
Also, as was often the case, FT families are more disadvantaged than
NFT families. Teachers, however, appear quite closely matched on all 
measures except choice of assignment, where, in contrast to most projects, 
NFT teachers report more autonomy. 

Results of outcome analyses (Table 34) show NFT pupils significantly 
above FT pupils on the affect measure with no other pupil differences 
approaching significance. The difference on the parent child interaction 
measure is significant in favor of FT parents. Also, FT teachers show 
significantly more approval and acceptance of their methods than NFT
teachers.

These Project BC(e) results are mixed and perplexing. Since this 
project was not included in the classroom observation sample, descriptions
of processes are not available to assist in interpretation. However, it 
does seem clear that this project is failing to attain at least one major 
goal of the model — that of developing positive pupil affect. On the 
other hand, the parent and teacher goals are being attained to some ex- 
tent, as evidenced by the significant results in these areas. 

Project BC(a) 

Data for Project BC(a), a Cohort I-K sample, are summarized in 
Table 35. Even though this project only marginally qualifies for inclu- 
sion in the analysis and evaluation, we are including the findings for 
purposes of completeness. 

The project is small (240 pupils) and suburban to a relatively small 
SMSA (63,000) in the New England region. The community is 99 percent 
white, the projected FT expenditure was $862 per pupil, and the PAC 
ratio was one member per 10 pupils. 

The FT and NFT groups are badly matched; pupils in the three
FT classes average about nine months older than those in the two NFT
classes. Nevertheless, the FT group scored only slightly better than
the NFT group on baseline tests. The NFT classes are nearly 75 percent
girls, whereas the FT classes are about 50 percent female. Although 
data problems prevented parent and teacher analyses, parent data are 
summarized for the pupils. As can be seen, FT families are substantially 
more disadvantaged than NFT families (less education, lower occupational 
levels, lower income, less employment, and fewer male heads of household). 

Results of the outcome analyses for this project show that FT pupils 
scored significantly higher than the NFT pupils on the cognitive process 
measure (95 percent confidence interval = .2 to 3.2 points). Although 
no other differences reached significance, a large NFT-favoring difference 
on the WRAT approached the 5 percent significance level. The evidence of 
positive effects of this FT project is considered marginal. 



Project BC(c) 

Project BC(c) is a relatively small project (292 pupils) located 
within a moderate-sized SMSA (601,000) in the mid-Atlantic region. The 
anticipated FT expenditure is relatively high, averaging $1,250 per pupil, 
with a pupil/PAC ratio of 8.6. Two cohort samples were gathered from 
this project: I-K (two-year) and II-K (one-year). Both of these samples 
were included in the classroom observation activities. 



The analysis data for the Cohort I, two-year effects sample are 
summarized in Table 36. These data show that the five FT and four NFT 
classrooms were reasonably comparable on baseline test scores, but not 
on classroom composition. FT classes had higher proportions of male 
pupils, lower proportions of Blacks, and a much higher proportion of 
pupils with preschool experience than the NFT classes. The families, 
however, were moderately similar except that the FT families tended to 
be somewhat more disadvantaged (lower relative educational level, income, 
occupational level, and employment, and fewer male heads of households). 
Teachers of the two groups were also moderately alike on inputs, although 
FT teachers were less satisfied with their working conditions, had fewer 
book resources, less training and experience, less choice of assignment, 
and fewer helpers than NFT teachers. This profile is quite unusual, 
since in other projects FT teachers tend to exceed NFT teachers on many 
or all of these variables. 

In the outcome analysis, FT pupils scored significantly better in 
quantitative skills (95 percent confidence interval = 3.6 to 15.0 points). 
No other pupil results reached significance, although the FT pupils 
tended to do better. 

None of the results for parents or teachers were statistically 
significant. On classroom observation factors, FT and NFT share similar 
patterns of scores, but FT is consistently higher than NFT, especially 
on the self-regulatory factor, which is emphasized in the model. The 
pattern would correspond better with the Bank Street model if the 
differences on the self-regulatory and expressive factors were even 
more pronounced. But the five factors are not as salient for this 
model as they are for most other models (cf. SRI, 1972b, Appendix B). 
Selected variables from several factors, such as small groups, wide 
variety of activities, reinforcement, use of objects, independence, and 
self-expression, would more nearly describe the Bank Street program. In 
any case, the pattern of process factors does not provide assistance in 
the interpretation of the single instance of FT superiority on the 
quantitative outcome measure. 



Project data from the Cohort II sample in Project BC(c) provide 
additional evidence of FT/NFT lack of comparability. These data, 
presented in Table 37, show that the four FT and three NFT classes are 
poorly matched on ethnic composition. Moreover, NFT families appeared more 
disadvantaged than FT families as defined by the conventional indicators 
of education, occupation, and poverty level, although the NFT heads 
of household were more often male and more often employed than the FT 
heads of household. Insufficient teacher data prevented analyses of 
teacher effects, but process data were available since the classrooms 
were included in the CO sample. 

Analysis of one-year effects for this sample failed to reveal 
significant FT/NFT differences on any of the pupil or parent outcomes. 
However, the FT pupils did somewhat better on all tests (adjusted scores), 
and on the quantitative measure, the difference between their scores and 
those of the NFT group approaches statistical significance. Process 
factor score averages show FT above NFT on the self-regulatory and child 
self-learning factors but close to NFT on the expressive dimension. The 
low expressive factor score for FT and the lower score for FT than NFT on 
the programmed academic dimension give a mixed picture of implementation 
of the model. The classroom observation scores do not serve to clarify 
our understanding of the pupil test scores for this cohort. 

Project BC(d) 




The data for the Cohort I sample are summarized in Table 38 for the 
two-year effects and in Table 39 for the one-year subset. A total of 
nine FT and four NFT classes were tested and appear highly comparable 
both on baseline test scores and classroom composition (with the 
exception of preschool experience, which was much more prevalent among 
FT pupils). Covariable data indicate that both FT and NFT families in this 
project were severely disadvantaged. The NFT parents reported less 
education, lower occupational status, and higher poverty eligibility than 
the FT group, although more NFT families had male heads of household and 
overall employment of head of household was higher for NFT. Because of 
insufficient data, teacher analyses were not performed on this sample. 



Analysis of FT/NFT differences on pupil measures indicates that the 
program had a significant effect on quantitative skills. The 95 percent 
confidence interval for this result shows from 1.9 to 14.1 score points 
in favor of FT. Other differences failed to reach significance, but the 
outcome on achievement came close enough (-.03 to 40.2 points) to warrant 
attention. 
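
The significance statements in this section can be read directly from the reported 95 percent confidence intervals: a difference is treated as significant when its interval excludes zero. The short Python sketch below applies that reading to the two Project BC(d) intervals quoted above. The function names and the "overlap" measure are illustrative assumptions, not part of the SRI analysis.

```python
def excludes_zero(lower, upper):
    """True when a confidence interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0


def overlap_fraction(lower, upper):
    """Share of the interval's width that extends past zero (0.0 if none).

    Hypothetical gauge of how close a non-significant result comes to
    significance; the report itself does not define such a measure.
    """
    if excludes_zero(lower, upper) or upper == lower:
        return 0.0
    return min(-lower, upper) / (upper - lower)


# 95 percent intervals quoted in the text for Project BC(d), Cohort I.
intervals = {
    "quantitative": (1.9, 14.1),
    "achievement": (-0.03, 40.2),
}

for measure, (low, high) in intervals.items():
    verdict = "significant" if excludes_zero(low, high) else "not significant"
    print(f"{measure}: {verdict}, overlap = {overlap_fraction(low, high):.4f}")
```

Under this reading the quantitative interval excludes zero, while the achievement interval crosses zero by only a very small share of its width, which is consistent with the text describing it as coming close to significance.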

Parent results showed that FT parents have a lower appraisal of 
their children's success opportunities than do NFT parents. Since the 
Bank Street model attempts to involve parents in both classroom and 
PAC activities, this result is unexpected. Perhaps the FT parents are 
appraising their children more realistically, or perhaps their goals are 
higher than those of NFT parents. Previous studies of community involve- 
ment (Zurcher, 1970; Gurin & Gurin, 1970) make these explanations plaus- 
ible. In any event, the outcome needs further study. 



That the impact of this project is increasing becomes apparent 
when first year results are compared with second year results for the 
same children. This analysis (Table 39) shows virtually no difference 
between FT and NFT groups on all test measures, and a near-significant, 
FT-favoring difference on attendance at the end of one year of FT. After 
two years, however, tests of these same FT and comparison group pupils 
show that FT pupils are stronger on cognitive variables (quantitative 
difference reaching significance) and, again, on attendance. 



This pattern of results can be explained in at least two different 
ways. Either the model's effects gradually accumulate over time or the 
structure and implementation of the Bank Street program substantially 
improved between 1970 and 1971. Since Cohort II samples were measured 
in this project, we can determine which of these explanations seems more 
likely; if Cohort II first-year effects are stronger than Cohort I 
first-year effects, the improved implementation explanation would seem 
more plausible. 

The Cohort II-EF results for Project BC(d) are summarized in Table 40. 
The baseline data indicate a moderately good match between FT and NFT 
classrooms on pupil scores, classroom composition, and family 
characteristics. The notable exceptions are the reading test score (NFT 
higher), preschool participation (FT much higher), and family leadership 
and employment (NFT more male household heads and higher proportion 
employed). Teacher characteristics were also quite similar, with the only 
notable difference being the greater number of helpers for FT teachers. 

Results are very strongly in favor of FT for this one-year sample. 
Significant differences appear on overall achievement, on WRAT, on reading, 
and on language skills. Differences on affect and quantitative skills 
also approach significance in favor of FT. 

No parent effects reached significance, but teacher acceptance of 
FT approached significance (95 percent confidence interval = -.05 
to 1.91). Hence, the principal impact of this program appears 
concentrated on pupil outcomes. 

Since the outcomes for the Cohort II, one-year group are much stronger 
than the outcomes for the Cohort I, one-year group — in fact, resemble more 
closely the results for the Cohort I, two-year group — we prefer the inter- 
pretation that the project was better implemented in 1971 than it was in 
1970. 



Summary 

The salient features of the Bank Street College approach can be 
summarized as follows: 

Focus and Objectives: emphasizes long range objectives via child 
self-development 

   Child 

      Cognitive 
         Develop competence in basic skills 
         Develop ability to probe, to reason, to solve problems 

      Affective 
         Develop capacity for enjoyment 
         Develop positive self-image 
         Develop self-direction 
         Develop expressiveness 

Curricular Approach 

   Teacher's role that of facilitator 
   Reinforcement primarily from teachers and aides 
   Individual and small group focus 
   Wide variety of activities provided 
   Heavy emphasis on child self-expression and self-regulatory activity 

Type of Parent Involvement 

   Inform parents about program 
   Parents participate in classrooms, PAC 
   Parents used as resource for teachers in planning educational 
   experience of child 

Evidence that the model is achieving its objectives is mixed from project 
to project, but some encouraging results were noted. These encouraging 
results include significant achievement gains in reading and language 
skills for entering first grade samples in Cohort II. Also, for the 
Cohort I entering first grade sample, a significant FT-favoring difference 
on quantitative skills occurred. However, these results are apparent 
primarily at the project level, since only for Cohort I, two-year data 
(kindergarten stream) is an across-project summary analysis for this 
sponsor possible. The results of this analysis are summarized in Table 41 
and show overall significant differences favoring the model on cognitive 
processes and on parent-school involvement measures. Teacher differences 
approached significance on professional acceptance of FT, and reached 
significance (NFT favoring) on the parent image measure. These findings 
indicate that for the Cohort I sample, the Bank Street College model has 
met with reasonable success in attaining its objectives of pupil gains 
and parent involvement. FT teachers, however, apparently hold less 
favorable attitudes regarding parent participation than do NFT teachers. 

Process data, which describe activities, i.e., the way the model 
was implemented, varied from project to project, indicating a high degree 
of variability in the fidelity with which the approach was implemented. 
Because of this variability, it is very difficult to formulate a complete 
evaluation of the interim success of this approach. As mentioned earlier, 
we can state that the results appear encouraging. 

MATHEMAGENIC ACTIVITIES PROGRAM (MAP) 
University of Georgia 

Sponsor's Intended Approach 

The MAP model emphasizes a scientific approach to learning based on 
teaching the child to make a coherent interpretation of reality. It ad- 
heres to the Piagetian perspective that cognitive and affective develop- 
ment are products of interactions between the child and the environment. 
It is not sufficient that the child merely copy his environment; he must 
be allowed to make his own interpretations in terms of his own level of 
development. 

An activity-based curriculum is essential to this model since it 
postulates active manipulation and interaction with the environment as 
the basis for learning. Individual and group tasks are structured to 
allow each child to involve himself in them at physical and social as 
well as intellectual levels of his being. Concrete materials are 
presented in a manner that permits him to experiment and discover problem 
solutions in a variety of ways. The sponsor contends true learning 
cannot occur when tasks that exceed a child's level of development are 
forced on him. On the other hand, a child is attracted and challenged 
to learn by tasks representing the next step beyond his current 
experience and knowledge level. Both teaching techniques and curriculum 
materials emphasize sequential arrangement of tasks in small steps to 
create a stimulating discrepancy or "mismatch." 

Thus, the mathemagenic classroom stresses learning by doing as well 
as individual initiative and decision-making on the part of the child. 
An attempt is made to maintain a careful balance between highly structured 
and relatively unstructured learning situations and between the level of 
conceptual material and the capability of individual children; small group 
instruction by teacher and aides is emphasized but with specific provisions 
for individual activity. This results in a great variety in the media 
employed, the activities available to the child, and in the social 
situations the child encounters. 

The classroom is arranged to allow several groups of children to be 
engaged simultaneously in similar or different activities. Teachers' 
manuals including both recommended teaching procedure and detailed lesson 
plans for eight curriculum areas (K-3) are provided in the model. 
Learning materials also include educational games children can use without 
supervision in small groups or by themselves. Art, music, and physical 
education are considered mathemagenic activities of equal importance to 
language, mathematics, science, and social studies. Feelings of 
self-confidence and motivation to learn are viewed as natural consequences 
of the mathemagenic approach to learning. 

Sponsor assistance to projects includes assignment of curriculum 
specialists to spend some time each month in continuous inservice 
teacher-aide training and a Project Advisor to coordinate the model with 
the other aspects of the Follow Through project, such as the Policy 
Advisory Committee, supporting services, and home-school activities. 
Preservice workshops are held during which teachers and teacher-aides gain 
experience using the curriculum materials and learn how to implement MAP 
principles. Second-year teachers and aides are expected to assume 
leadership roles in these training workshops, and parents and the Policy 
Advisory Committee are invited to all sessions. Parents and Follow Through 
staff work together during the year in the overall efforts in home-school 
coordination and in encouraging the local community to participate in the 
program. 

Evaluation is a continual process. Project staff participate jointly 
in evaluating the effectiveness of various aspects of the program and in 
recommending improvements. Evaluative information is used in program 
development and for specifying, in observable terms, important dimensions 
of the program. 

Individual Project Results 

Only one project, which became an MAP project in 1969-70, was 
available for analysis of effects. This project sample consists of entering 
first grade pupils in Cohort I. The project is relatively small (397 
pupils), in a predominantly white (3 percent nonwhite) community located 
30 to 40 miles from a large urban SMSA in the south Atlantic region. 

The data for the analysis of this project are summarized in Table 42. 
Baseline values on pupils, parents, and teachers indicate substantial lack 
of comparability between FT and NFT samples on many variables. 
Specifically, FT classes averaged below NFT classes on nearly all baseline 
tests. FT pupils also averaged several months older than the NFT pupils 
and, as is typical in this experiment, were more likely to have had 
preschool experience. The FT families tended to be more disadvantaged than 
NFT. Fewer FT heads of household had skilled occupations and were fully 
employed, and thus FT families were more likely to be impoverished by 
OEO standards. However, the actual employment rates and the proportion 
of families with male heads were quite high. 

FT teachers and NFT teachers were reasonably comparable on most 
variables. Only on book resources and number of helpers did notable 
differences occur, showing FT higher on each. 

Analysis of outcomes failed to reveal significant FT/NFT differences 
for any measure, pupil, parent, or teacher. Since classroom observation 
data were not collected for this sample, descriptions of process components 
and differences are unavailable. 



Summary 

Since data from only a single project sample were available for 
interim evaluation of sponsor effects, the risk of faulty interpretations 
is considered very high. It does appear that there is no clear evidence 
of a two-year program impact on this EF sample. However, review of the 
salient features of the model suggests that these results could be 
expected. These features are: 



Focus and Objectives: long range program objectives 

   Child 

      Cognitive 
         Develop academic competence in many different areas 

      Affective 
         Promote feelings of self-confidence and motivation to learn 

Curricular Approach 

   Teacher's role that of facilitator 
   Reinforcement from teacher and activities 
   Small group focus with provision for individual activity 
   Balances highly structured and relatively unstructured activities 
   Emphasizes sequential arrangement of tasks in small steps 

Type of Parent Involvement 

   Minimal during period covered by report. 

The model does not stress immediate academic impacts or extensive 
parent involvement as do many of the alternate approaches. The model 
places the teacher in a guidance role and appears to incorporate a 
Montessori-like concept of the child's learning from structured 
experiences. In this perspective, it is altogether possible that large 
differences on pupil measures would not emerge early in the child's FT 
experiences. Rather, effects of this model should occur on such 
noncognitive factors as motivation, curiosity, exploratory behavior, and 
the like. Unfortunately, adequate measures of these traits do not 
currently exist for use in large-scale evaluations. Further, this model 
was not implemented in this project until 1969-70, and hence by the 
eligibility definition of project inclusion (see p. 24) should not have 
been included in the evaluation sample in the first place. Thus, we must 
conclude that the evidence necessary to evaluate the University of Georgia 
model on its own terms is not available in these interim data. 

UNIVERSITY OF OREGON ENGELMANN/BECKER MODEL FOR DIRECT INSTRUCTION 

University of Oregon 

Sponsor's Intended Approach 

The sponsors of this model insist that a child who fails is a child 
who has not been properly taught and that the remedy lies in teaching the 
skills that have not been mastered. The model attempts to bring 
disadvantaged children up to the "normal" level of achievement of their 
middle-class peers by building on whatever skills children bring to school 
and to do so at an accelerated pace. 

Using programmed reading, arithmetic, language, art, and music 
materials and behavior modification principles, the model employs strategies 
to teach concepts and skills required to master subsequent tasks oriented 
toward a growing level of competence. Emphasis is placed on learning the 
general case, i.e., developing intelligent behavior, rather than on rote 
behavior. Desired behaviors are systematically reinforced by praise and 
pleasurable activities, and unproductive or antisocial behavior is ignored. 

In the classroom there are three adults for every 25 to 30 children: 
a regular teacher and two full-time aides recruited from the Follow Through 
parent community. Working very closely with a group of 5 or 6 pupils at 
a time, each teacher and aide employs the programmed materials in 
combination with frequent and persistent reinforcing responses, applying 
remedial measures where necessary and proceeding only when the success of 
each child with a given instructional unit is demonstrated. At the same 
time, the teacher aides are working with other small groups throughout the 
classroom in a similar manner. Training in implementing the model includes 
local summer workshops for all teachers and teacher aides and inservice 
training during the school year. 

Family workers, who are usually parents themselves, personally 
contact all project parents to acquaint them with the program and teaching 
materials; inform them about their children's progress; and encourage them 
to attend Policy Advisory Committee meetings, visit school, and 
participate in training leading to work in the school. Parent workers also 
instruct parents in the use of materials to supplement the school program 
in the home and attempt to organize parents experiencing special 
difficulties into problem solving groups. On occasion, they contact local 
social service agencies where special assistance is needed by individual 
families. 

Evaluation is an ongoing part of the program. Tests are administered 
at the beginning and throughout the year to determine if children are being 
taught the skills required by the model and at what rate. The tests are 
administered by parents especially trained for the job. Continuous test 
data provide a positive gauge of teacher performance and allow for timely 
remedial action when the program appears to be implemented improperly or 
students appear to be falling behind. Video tapes of teachers and aides 
executing training tasks are used both to determine and to correct specific 
difficulties. Bi-monthly reports are issued to teachers reporting the 
progress of individual children and classroom summaries. 

The parent Policy Advisory Committee participates actively in the model, 
focusing attention on the needs and interests of parents, recruiting parent 
aides, and assisting in writing the Follow Through proposal. The model is 
firmly committed to support a parent-community-school partnership in the 
operation of its program. The sponsor feels project parents must have the 
right to judge the effects of the program for themselves, both to provide 
criteria of program success and to guide efforts at program improvement. 

Individual Project Results 

Eight samples from five different projects sponsored by the University 
of Oregon (UO) were included in the analysis of interim effects. The 
distribution of these evaluation samples in terms of cohort, outcome, and 
project is as follows: 



Cohort     1st-year Effects     2nd-year Effects 

IK                              (projects a, b, and c) 
IEF        (project d)          (projects d and e) 
IIK        (project a) 
IIEF       (project d) 


Project UO(c) 

Project UO(c) is located in a racially mixed but predominantly 
white community in a small east north central urban area. The project 
is moderate in size (480 pupils) with an anticipated per-pupil expenditure 
of $694 and an unusually high PAC participation rate of one member for 
roughly every four and one-half pupils. 

The Cohort I-K, two-year data gathered on this project sample 
(Table 43) show that FT and NFT classrooms were reasonably comparable 
at baseline, with the FT group showing a slight advantage. FT pupils 
tended to average higher on pretest measures; since baseline 
testing occurred very late in Fall 1969, this result may be due to 
initial program impacts in this project. The FT and NFT samples were 
essentially equivalent in terms of classroom composition and family 
characteristics. Specifically, most classroom samples were nearly evenly 
split between Black and non-Black pupils, and the few small differences 
between the groups on parent education, occupation, and employment are 
considered negligible. 

Comparison of adjusted outcomes for these pupils shows that differ- 
ences between FT and NFT pupils fail to reach significance on any of the 
evaluation variables. A trend toward positive pupil impacts is suggested 
by FT-favoring differences on the cognitive measures, but inspection of 
confidence intervals indicates that conclusions cannot be justified at 
this point. None of the differences on parent measures reach significance 
although the parent/school involvement and expectations measures show 
differences in the desired direction. 

The teacher data for this project show that FT and NFT teachers were 
reasonably comparable in their satisfaction with their job, the amount of 
book resources, race, and closeness to the community. FT teachers, on 
the other hand, appeared to have more training and more classroom helpers 
than NFT teachers, but reported less freedom to choose their assignment. 
Analysis of program effects on teachers, which controlled for these 
differences, failed to reveal any significant outcome on the evaluation 
variables. 

Project UO(b) 

The data for the Project UO(b) Cohort I-K, two-year sample are 
presented in Table 44. This project is relatively small (225 pupils), 
located within a school district in a large eastern urban population 
center. The anticipated per-pupil expenditure for this project was $902 
and there was one PAC member for each 6.6 pupils, or roughly three PAC 
members per classroom. 

The pupil baseline test, classroom composition, and family data for 
the three FT and four NFT classrooms indicate a fairly serious problem 
in noncomparability for this project sample. Specifically, NFT pupils 
systematically scored above the FT pupils on all baseline measures, and 
the FT classrooms were predominantly, if not completely, Black, whereas 
NFT classrooms were almost totally non-Black. Furthermore, although the 
FT parents tended to be better educated, fewer were employed and fewer 
had male heads of household than the parents of children in NFT classrooms. 
We believe that the bias introduced by this type of mismatch of control 
and treatment group seriously impairs the interpretability of any 
subsequent results. That is, even though our statistical procedures are 
designed to adjust for differences in certain baseline properties, 
substantial population differences between the two subsamples increase the 
probability of differential regressions. Since this mismatch exists for 
both pupil and parent variables, we feel a more appropriate procedure is 
to avoid interpretation of results for this project. 
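
The concern about differential regressions can be illustrated with a minimal Python sketch of a one-covariate covariance adjustment. It assumes a single baseline score per class and a pooled within-group slope; the report's actual adjustment procedure is not specified in this section, and all function names and class-mean values below are hypothetical.

```python
from statistics import mean


def slope(baseline, outcome):
    """Least-squares slope of outcome on baseline within one group."""
    mx, my = mean(baseline), mean(outcome)
    sxx = sum((x - mx) ** 2 for x in baseline)
    sxy = sum((x - mx) * (y - my) for x, y in zip(baseline, outcome))
    return sxy / sxx


def pooled_slope(groups):
    """Pooled within-group slope; assumes the groups share one regression."""
    sxx = sxy = 0.0
    for baseline, outcome in groups:
        mx, my = mean(baseline), mean(outcome)
        sxx += sum((x - mx) ** 2 for x in baseline)
        sxy += sum((x - mx) * (y - my) for x, y in zip(baseline, outcome))
    return sxy / sxx


def adjusted_difference(ft, nft):
    """FT minus NFT outcome difference after removing the baseline gap."""
    b = pooled_slope([ft, nft])
    raw = mean(ft[1]) - mean(nft[1])
    return raw - b * (mean(ft[0]) - mean(nft[0]))


# Hypothetical class means: (baseline scores, outcome scores).
ft = ([10, 12, 14], [20, 24, 28])
nft_same_slope = ([20, 22, 24], [40, 44, 48])
nft_other_slope = ([20, 22, 24], [42, 44, 46])

print(adjusted_difference(ft, nft_same_slope))
print(slope(*ft), slope(*nft_other_slope))
```

When both groups follow the same regression, the adjustment removes the baseline gap cleanly. When the within-group slopes differ, as in the second comparison group, a single pooled slope no longer describes either group, and the adjusted difference depends on where along the baseline scale the groups are compared. Badly mismatched groups make that situation more likely.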

The teacher characteristics for the FT and NFT samples show that the 
teachers were reasonably comparable on most of the variables except those 
measuring number of book resources and number of helpers available for 
classroom assistance. On the whole, the FT teachers appear to be less 
well suited to the task than the NFT teachers; they report less 
satisfaction and fewer resources, they are less close to the community, and 
they have slightly less training and experience. With adjustments for these 
differences, analysis of the teacher outcome variables reveals that FT 
teachers are significantly more approving of their method than are NFT 
teachers, a variable that assumes great importance to those who believe 
that the enthusiasm and sense of commitment maintained by the teacher 
will ultimately relate to the success of the program. This finding may 
also be taken as indirect evidence that the program is reasonably well 
implemented. 

Project UO(a) 

Project UO(a) comprises a Cohort I-K second-year sample and a 
Cohort II-K first-year sample. This project is moderately large and is 
located in an urban area in the east north central region. The 
project averaged one PAC member for approximately 28 pupils, with an 
anticipated per-pupil expenditure of just over $1,000. 

The evaluation data for the Cohort I-K sample are presented in 
Table 45. Values for the ten FT and six NFT classrooms on the control 
variables show FT pupils are systematically above NFT on the pretest 
measures and percentage with preschool experience. The FT classrooms have 
a better ethnic balance in this sample (NFT is nearly all Black), but the 
FT families appear to be more disadvantaged than the NFT families. Thus, 
overall, the two subgroups appear only moderately comparable in terms of 
these baseline measures. 






Outcome differences adjusted for these baseline biases show significance
only for the attendance measure: FT pupils have lower absence rates than
NFT pupils. In addition, FT pupils scored higher on four of the six
cognitive outcome variables. However, none of these differences reaches
significance; thus, we must interpret the outcomes as showing no
substantial project effect on the pupils.

Parent outcome measures indicate a similar lack of effect. In no
instance do the adjusted FT/NFT differences reach significance. 

Comparison of averages on the teacher variables shows that the FT
teachers reported a higher degree of job satisfaction but had fewer book
resources than the NFT teachers. Furthermore, the FT teachers had more
classroom help and more freedom to choose their assignments but were
somewhat less experienced than the NFT teachers. The outcome measures for
teachers show that FT teachers are significantly more accepting of their
teaching methods than are NFT teachers. We should again note that this
difference is important to the extent that the teacher's approval of the
teaching method influences the success of the program.

The evaluation data for the Cohort II, one-year sample in this project
are summarized in Table 46. Some very serious problems regarding the
comparability of the FT and NFT samples within this cohort project are
apparent from this table. For example, the FT classes systematically
averaged above the NFT classes. Furthermore, the data on classroom
composition are highly suspicious, particularly for the preschool
experience of NFT pupils, which does not correspond to that reported by
the parents. Finally, NFT values on many of the parent background and
poverty variables were imputed* for this sample. This imputation most
likely seriously underestimated the comparison group values, which would
result in underadjustment or even adjustment in the wrong direction for
FT/NFT differences. Because of this fairly strong evidence of data
problems and the comparison group's severe lack of comparability with the
FT group, we feel that interpretation of these parent and pupil outcome
data would lead to unwarranted conclusions. Any analysis and
corresponding interpretation must be based on a more complete and
verified data set.



* A discussion of imputation (i.e., estimation) problems can be found in
Annex A. Detailed rules for imputing scores for each variable in cases
where data were missing are too extensive to be incorporated in this
report, but they are part of the formal documentation for the analysis.

The teacher data obtained for this project do appear to be accurate.
The specific values on the teacher variables indicate that the FT teachers
were somewhat more satisfied with their job conditions, had more book
resources, were somewhat more closely tied to the communities in which
they taught, had more helpers, had had more freedom to choose their
assignments, and were slightly better qualified than the NFT teachers.
Analysis of outcomes indicates that neither of the outcome measures shows
a significant FT/NFT difference.


Project UO(e) 

The data for Project UO(e) are summarized in Table 47. This
moderate-sized project is located more than 75 miles from the nearest SMSA
in the west south central region of the United States, had a low PAC
membership of one PAC member for each 42 pupils, and anticipated a
near-average FT expenditure of $757 per pupil.

The pupil baseline variables show that the seven FT classrooms and
the two NFT classrooms were far from comparable on prescores. FT pupils
averaged below NFT pupils on nearly all measures and were 15 points below
on the reading scores. However, the two samples do appear reasonably
well matched on classroom composition variables.



Parent variables were not available for control as covariables; their 
absence severely limited the interpretability of the outcomes of our
analysis. As we have repeatedly noted, FT families tend to be more
severely disadvantaged than comparison group families. Since indices of 
disadvantagement relate strongly to outcomes, they are essential for 
appropriate adjustment and interpretation of the FT/NFT differences. 



We do not attempt to interpret the results of an inappropriate
analysis of pupil outcome data. We choose instead to limit our discussion
to classroom observation data. All of the factors appear to be quite
salient for describing classroom processes in this project. The factor
score averages reveal a pattern consistent with the model. In particular,
the high score on the programmed academic factor would be expected. It
appears that the FT and NFT classes differ considerably and that the
model has been implemented in this project.



Project UO(d)



Project UO(d) is located fairly far from the nearest SMSA in the
east south central region. It is a moderately large project in a racially
mixed but predominantly white community. Since the public schools do not
offer kindergarten, the groups in this project are classified as
entering first grade cohort samples. The project had an average of 18.2
pupils per PAC member and an anticipated per-pupil expenditure of $735.

Three sets of data were analyzed for this project: a Cohort I, two-year
effects sample; a one-year effects subset for this same sample; and a
Cohort II, one-year effects sample. The data for the two-year effects
sample are presented in Table 48, and the data for the first-year subset
of this sample, in Table 49. Finally, the data for the Cohort II,
first-year sample are presented in Table 50.

The baseline data for each of these samples indicate a serious mismatch
of the FT and NFT pupil samples. FT pupils were below the NFT
comparison pupils on entering abilities. Most of the FT pupils were Black
and came from very impoverished families, whereas most of the NFT pupils
were non-Black, and very few came from families that met the poverty 
criteria. Because the samples are very different, we believe that the 
probability of inappropriate covariable adjusting is extreme. In fact, 
the two samples can be characterized as belonging to two different popu- 
lations on all covariables of interest. Consequently, we believe that 
any interpretation of the results of pupil and parent outcome analyses 
for this project would be invalid at this time. We present the project 
data for descriptive purposes only. 

Teacher and classroom observation data were also obtained for these
two samples (the Cohort I, two-year and the Cohort II, one-year samples).
They point up additional differences between the two groups. The Follow
Through teachers apparently have fewer book resources and less freedom
to choose their assignments than the NFT teachers; on the other hand,
more classroom helpers are used in FT classrooms than in NFT classrooms.
Measures of teacher attitudes toward parents and toward their teaching
methods show that FT teachers for the Cohort I sample are significantly
more accepting of their methods than are the NFT teachers. This result
did not recur in the Cohort II sample.

The factor scores associated with the respective classrooms for
these two samples reveal an interesting and repeated pattern. As would
be expected from the sponsor's model, the FT classrooms are strongly
characterized by a high programmed academic factor. The NFT classrooms,
on the other hand, can be characterized by very low frequency of
self-regulatory and child self-learning activities. The sponsor appears to
have affected classroom processes; they are quite different from the
processes occurring in NFT classrooms.

Summary



The salient features of the University of Oregon model are summarized 
below: 

Focus and Objectives: emphasizes short-range program objectives
    Child
        Cognitive (major emphasis)
            Develop competence in reading, math, language, art, and music

Curricular Approach
    Teacher's role is that of director
    Desired behaviors are systematically reinforced by praise and
        pleasurable activities
    Teacher and two aides each work closely with small groups
    Highly programmed materials and structured environment
    Structured responsiveness expected on part of child

Type of Parent Involvement
    Family workers personally contact all parents to acquaint them
        with program and child's progress
    Parent workers instruct parents in use of materials to supplement
        school program
    Encourage parents to participate in PAC and to volunteer in
        classroom.

In many ways, this model can be considered the most structured and
well defined of all the FT approaches. It is unfortunate that the FT
and NFT samples for the Cohort I and II projects in this evaluation were
so badly matched. The results of sponsor-level analyses for Cohort I-K
and I-EF are summarized in Tables 51 and 52. As discussed in the
interpretation of project data, these FT and NFT samples represented
distinct population subgroups. Hence, the results of significance tests on
pupil outcomes are likely invalid, particularly in the analysis of the
groups where the matching problem is acute. That is, the FT samples were
characterized by Black children from poor families with low baseline
scores, whereas the NFT groups were characterized by non-Black children
with higher baseline scores from families that can scarcely be
characterized as disadvantaged.

These comparison group problems notwithstanding, these analyses do
show evidence of greater parental involvement (K stream) and greater
teacher acceptance of FT methods (K and EF streams). Also, the classroom
observations conducted within these projects show evidence of a
high degree of correspondence between the observed teacher processes and
those specified by the model, suggesting that the program is being
appropriately implemented. If, in subsequent cohorts or in subsequent
measurements within these cohorts, a more acceptable degree of control
group comparability can be established, then the impact and effects of
this model on pupil gains can be properly assessed.


BEHAVIOR ANALYSIS APPROACH
University of Kansas

Sponsor's Intended Approach

The behavior analysis model is based on the experimental analysis of
behavior and uses a token exchange system to provide precise, positive
reinforcement of desired behavior. The tokens provide an immediate reward
to the child for successfully completing a learning task. He can
later exchange these tokens for an activity he particularly values, such
as playing with blocks or listening to stories. Initial emphasis in the
behavior analysis classroom is on developing social and classroom skills,
followed by increasing emphasis on the core subjects of reading,
mathematics, and handwriting. The goal is to achieve a standard but still
flexible pattern of instruction and learning that is both rapid and
pleasurable.

The model calls for careful and accurate definitions of instructional
objectives, whether they have to do with social skills or with academic
skills. Curriculum materials used describe the behavior a child will be
capable of at the end of a learning sequence and clearly state criteria
for judging a response as "correct." They also require the teacher to
make frequent reinforcing responses to the child's behavior and permit the
child to progress through learning tasks at his own pace. The child earns
more tokens during the initial stages of learning a task and progressively
fewer as he approaches mastery, the object being to move from external
rewards to self-motivated behavior. Since a child with few tokens to
exchange for a preferred activity is likely to be a child needing more
attention, the system guides the teacher in evaluating her own performance.

In the behavior analysis classroom, four adults work together as an
instructional team. This includes a teacher who leads the team and assumes
responsibility for the reading program, a full-time aide who concentrates
on small group math instruction, and two project parent aides who attend
to spelling, handwriting, and individual tutoring. Parent aides are
employed on a rotating basis with other parents. They first serve as
classroom trainees for a period of several weeks; some of these parents, in
turn, become aides for a full semester. Full-time teacher aides are
employed from the latter group. The short trainee cycle allows a great
number of parents to become directly involved in the program. They then
carry its main features into the home situation.

Careful staff planning is an integral part of the behavior analysis
daily schedule. Each day includes planning sessions, periods of formal
instruction, and special activity periods during which the children
exchange their tokens for an activity they choose. Instruction and special
activity periods alternate throughout the day, with the amount of time for
instruction increasing as the amount of reinforcement required to sustain
motivation decreases.

Evaluation of the model begins with an entry behavior inventory and 
diagnostic tests that determine where each child should begin a sequence 
of instruction and that also help to monitor his progress through the 
sequence. The curriculum materials used also provide for periodic testing 
and monitoring of achievement gains. Throughout the school year a com- 
puterized record-keeping system issues to the teacher a weekly progress 
report on each child and also reports progress for the class as a whole. 

Generally, implementation of the behavior analysis model proceeds in
three phases. In the first, the sponsor supplies substantial advisory 
support and training in the procedures and techniques of the program. In 
the second, local leadership takes over and local staff training coordi- 
nators assume more and more of the training and support responsibility. 
Finally, only periodic consulting with the sponsor is needed. 

Individual Project Results 

Five samples from three different projects sponsored by the University 
of Kansas (UK) were included in the analysis of interim effects. The dis- 
tribution of these evaluation samples in terms of cohort, outcome, and 
project is as follows: 

Cohort    First-year Effects    Second-year Effects

I-K       (projects b & c)      (projects a, b, & c)

Project UK(a)

Two-year evaluation data for Project UK(a) are summarized in Table 53.
This relatively small project is located within a large Eastern
population center. The anticipated FT expenditure was $763 per pupil, and
the pupil/PAC member ratio was 9.6.

Baseline data indicate the two FT and five NFT classrooms were
moderately comparable, but that more FT pupils came from poor Black
families. Interestingly, the FT pupils averaged higher than the NFT pupils
on most of the baseline measures. After adjustments for these baseline
differences, outcome data failed to indicate any significant FT/NFT
differences on either pupil or parent variables. Since process data were
not gathered for this project, no further interpretations are possible.

Project UK(b) 

The second-year pupil data in Project UK(b) are summarized in
Table 54, and evaluation data for the first-year subset are summarized in
Table 55. This project is also located within a large metropolitan area
in the Middle Atlantic region. Project UK(b) was large (1,240 pupils),
anticipated spending $832 per pupil, and averaged one PAC member for each
9.6 pupils. Because of the size of this project, total PAC membership
was extensive (i.e., 129 members).

The baseline data for this project indicate that FT and NFT pupils
had similar entering scores. All were Black, and classroom compositions
were reasonably equivalent. However, serious problems emerge in the
parent data. The FT parents averaged below the NFT parents on educational,
employment, and occupational levels, yet all NFT families were rated as
poverty eligible. This phenomenon dramatically illustrates the missing
data problem referred to earlier. In this project, poverty data were
available only for those NFT families who were, in fact, poverty eligible.
Therefore, no data on those NFT families who were not poverty eligible
were included. Since poverty is highly related to outcomes, we feel it
should be included as a covariable. But restricting the data to subsets
of complete data would eliminate one of the two NFT classrooms, hence
precluding analysis. Thus again we are faced with a sample of data that
cannot be adequately analyzed because of comparison-group problems. Our
feeling regarding estimation of effects for this specific project is that
outcomes are in favor of FT. Unfortunately, we cannot attach a
significance level to this interpretation, since the covariable values
have produced distortions in analysis.

Classroom observation factor scores, which were available only for
the FT classes, reveal that the instructional process corresponds closely
to that intended by the sponsor. The strongest curriculum component, and
the only one that is above average, is the programmed academic factor;
the weakest are the child-initiated interactions and the self-regulatory
factors.

Differences between one- and two-year data (I-K2 versus I-K1) in the
unadjusted outcomes on pupil variables also show encouraging evidence of
impact. That is, the two-year results show larger differences than the
one-year results on every cognitive measure. Since these two groups were
nearly equivalent at the baseline, the probability that this event would
happen by chance alone is small enough (less than 2 percent for each set
of outcomes) that we are reasonably certain the program in this project
is producing its intended impact on pupil growth.
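
The chance argument above has the form of a sign test, and its arithmetic
can be sketched briefly. The example below is illustrative only. It assumes
six cognitive measures (the count cited elsewhere in this report; the number
in each outcome set for this project is not stated here) and assumes that,
under chance alone, each measure independently has an even probability of
showing the larger two-year difference. The function name is hypothetical.

```python
from math import comb


def sign_test_p(n_measures: int, n_favoring: int) -> float:
    """One-sided probability of at least n_favoring of n_measures
    favoring one direction when each is a fair coin flip."""
    favorable = sum(comb(n_measures, k)
                    for k in range(n_favoring, n_measures + 1))
    return favorable / 2 ** n_measures


# Assumed case: all six cognitive measures show a larger two-year difference.
p = sign_test_p(6, 6)
print(p)  # 0.015625
```

Under these assumptions the result falls below the 2 percent figure quoted
in the text. Because achievement measures are usually correlated with one
another, the independence assumption is optimistic, and the true chance
probability would be somewhat larger.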

Project UK(c) 

The remaining University of Kansas project, located over 75 miles 
from a west north central urban area, was also moderately small (277 
pupils) with an average of 10 pupils per PAC member and an anticipated 
per-pupil FT expenditure of $773. Two data samples were analyzed for
this project: a second-year group and a first-year subset. These data 
are presented in Tables 56 and 57, respectively. 

The control variable data for the four FT and five NFT samples show 
a very good match. FT groups are nearly equivalent to NFT groups on pre- 
scores, classroom compositions, and all parent measures. In fact, the FT 
group appears only negligibly more disadvantaged than NFT. 

Differences in FT/NFT pupil measures fail to show any significant 
two-year program effects. Parent differences also fail to reach signifi- 
cance. Classroom observation data were not available to aid in inter- 
preting these results. 

Analysis of the one-year subset reveals that, at the end of one year
of the program, significant FT-favoring differences existed on
achievement, WRAT, quantitative, and reading measures. Just why these
differences disappear in the two-year effects data is far from clear. One
possibility is that NFT teachers are adopting the FT methods. Another
possible explanation is that FT teachers altered their procedures or
levels of effort. But since teacher data were unavailable for this sample,
any such explanation is speculative and unsupportable at this point in the
evaluation.

Summary



The salient features of this approach can be outlined as follows: 

Focus and Objectives: emphasizes short-range program objectives
    Child
        Cognitive (major focus)
            Increase academic achievement in basic skills
        Affective
            Develop social skills

Curricular Approach
    Teacher's role is that of director
    Desired behavior is reinforced with tokens, which are later
        exchanged for activity of child's choice
    Individual focus with child proceeding at own rate
    Highly structured curriculum, plus free play time
    Structured responsiveness expected on part of child

Type of Parent Involvement
    Train parents for direct involvement in classrooms as parent
        aides
    Advise parents about how to continue education of child at home.

The project-by-project evidence for the interim impact of this
approach is mixed and, because of data problems, often uninterpretable.
The one-year subset data appeared to indicate that significant academic
progress was resulting from the program, but such evidence was not
replicated in the two-year data. Results of the sponsor-level analysis on
the Cohort I-K projects are presented in Table 58. They suggest that the
model has produced significant pupil gains on the cognitive process
measures. The results also show that NFT samples averaged significantly
higher on the parent-child interaction measure. Because of missing data
and the lack of comparability of the comparison groups, these results are
probably invalid. Thus, we cannot confidently make statements regarding
the relative impacts of this model at this time.


COGNITIVELY ORIENTED CURRICULUM MODEL
High/Scope Educational Research Foundation

Sponsor's Intended Approach 

The High/Scope Educational Research Foundation model represents a
synthesis of research in preschool and early elementary education. The
program recommends an "open framework" classroom that combines emphasis
on active experience and involvement of the child; a systematic,
consistent, and thoroughly planned approach to child development and
instruction by the teacher; and continuous assessment of each child's
level of development so that appropriate materials and activities can be
provided. This approach is based on the conviction that telling and
showing do not teach, but that active experience with real objects does.

This approach uses a cognitively oriented curriculum, which takes 
into account the very real difference between the way children "think" 
and the way adults do. The model's aim is to nurture in children the 
thinking skills they will need throughout their school years and adult 
lives, as well as the academic subject competencies traditionally taught 
in the early elementary grades. It emphasizes and is designed to support 
the process of learning rather than particular subject matter. It is 
central to High/Scope's program that learning should be active, that it 
occurs through the child's action on the environment and his resultant 
discoveries.

Each month one or more sponsor staff members spend up to a week at
each project site. Field Consultants assist with issues relating to the 
instructional model: room arrangement, scheduling, teaching methods, 
planning, learning centers, and the like. Program Specialists deal with 
specific academic areas — math, science, social studies, and communication — 
and with the curriculum materials, both commercially developed and those 
prepared by the sponsor. Curriculum Developers and administrative per- 
sonnel also travel to projects as often as is necessary and feasible. 

High/Scope Foundation staff present three major training and planning
workshops at the Foundation during the year: in the spring, summer, and
winter. In the fall, they conduct individual workshops at each project,
primarily for teaching staff. In addition, High/Scope Foundation operates
laboratory classrooms to increase the scope and versatility of
training and curriculum development activities.

Staff at projects include a project director, curriculum assistants,
classroom staff, parent program staff, and home visitors. Each classroom
has two teachers and an aide, or a single teacher with two aides, who
operate as a teaching team. The instructional staff is supervised by
and receives continuing inservice training and program monitoring from
the local Curriculum Assistant (CA). The CAs therefore receive the most
extensive training by Foundation staff. CAs bear prime responsibility
for planning, demonstrating, and evaluating activities in the six to
eight classrooms under their supervision and, in general, for ensuring
smooth implementation of the High/Scope model at each field site.

The parent program and home visit staff vary according to local 
needs and objectives. Each local project essentially designs and
implements its own parent program, with general guidelines and consultation
from High/Scope Foundation staff. 

The home teaching component of the program consists of planned visits 
to the home by classroom teachers or individuals hired specifically as 
home visitors. The child, a parent, and the home visitor work together 
during the visit, focusing on current and past activities at school and 
on supportive activities that may be carried out at home. 



Individual Project Results

Five samples from three different projects sponsored by High/Scope 
Educational Research Foundation were included in the analysis of interim 
effects. The distribution of these evaluation samples in terms of cohort, 
outcome, and project is as follows:

Cohort First-year Effects Second-year Effects 
I-K (Project c)

I-EF (Projects a & b) (Projects a & b)



Project HS(c) 

The evaluation data for Project HS(c) are presented in Table 59.
This is a relatively small kindergarten entrance project located in a
large, racially mixed urban mid-Atlantic population center. The project
anticipated near average per pupil expenditures and maintained a lower
than average PAC/pupil ratio of approximately one PAC member for every
30 pupils.

The baseline data for this project show some highly unusual patterns,
which suggest to us that the prescores on the tests may be invalid.
Specifically, the families constituting this cohort sample are poor, Black,
and below average in educational and employment levels. Also, these FT
families display the commonly noted pattern of being more disadvantaged
than their comparison groups. However, the baseline test scores show FT
pupils consistently (and significantly at the .05 level) above NFT pupils
on all measures. This difference in scores is in sharp contrast to the
characteristic and understandable pattern noted in all other projects in
this evaluation: namely, the direct relationship between prescores and
poverty indicators. Furthermore, this prescore bias cannot be attributed
to preschool experience since the FT pupils had proportionately less such
experience than the NFT pupils.

If (as we strongly suspect but are unable to confirm*) the prescores 
were inflated in favor of FT, the covariance adjustments would cause the
resultant FT/NFT contrast to be seriously biased against FT. Since nearly 
all cognitive test variables show significant differences in favor of NFT, 
we suspect that such biasing occurred. The exceptions are the cognitive 
process measure, the affect measure (significantly in favor of FT) and 
the attendance measure (also significantly in favor of FT). None of the 
parent measures showed significant program effects. 
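
The direction of the suspected bias follows from the usual form of a
covariance adjustment, in which the adjusted FT/NFT difference is the
posttest difference minus a regression slope times the pretest difference.
The sketch below illustrates this with a single covariate. The group means
and slope are hypothetical, not values from Table 59, and the function name
is ours; the report's actual analysis used several covariables.

```python
def adjusted_difference(post_ft, post_nft, pre_ft, pre_nft, slope):
    """Covariance-adjusted FT minus NFT difference for one covariate.

    slope is the pooled within-group regression of posttest on pretest.
    """
    return (post_ft - post_nft) - slope * (pre_ft - pre_nft)


# Hypothetical means: both groups truly start level and finish level.
true_diff = adjusted_difference(50.0, 50.0, 40.0, 40.0, slope=0.5)

# Same posttest means, but the FT pretest mean is inflated by 5 points.
biased_diff = adjusted_difference(50.0, 50.0, 45.0, 40.0, slope=0.5)

print(true_diff)    # 0.0
print(biased_diff)  # -2.5
```

With a positive slope, any spurious inflation of the FT prescores is
subtracted from the FT side of the contrast, so an adjusted difference
that should be zero appears to favor NFT.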

We are faced with two alternatives. Either we (a) accept the project
data at face value and interpret the pupil results as showing the FT
group as well above NFT on pretest measures but well below NFT on
posttest measures, thus producing evidence that the FT program hindered
pupil development, or (b) consider the pretest data as invalid and exclude
the project data from our interpretation of the effects of this model.
Although we can find no independent evidence to support the interpretation
that baseline data are invalid, we believe that the circumstantial
evidence is sufficiently compelling to make exclusion of this project from
the evaluation of this sponsor's effects the more prudent course of
action.



* Fall 1969 baseline data were collected under the decentralized field
operations procedure. Information regarding procedures used in specific
sites was available only through records provided by site personnel.


Project HS(b)



The evaluative data for the Project HS(b) sample, a Cohort I, Enter- 
ing First Grade sample, are summarized in Table 60 for the second-year 
effects, and in Table 61 for the first-year subset. This moderately 
sized project is located in a primarily white rural community in the 
south Atlantic region. The anticipated per-pupil expenditure of $928 is 
slightly above the overall average, and the rate of one PAC member per 
20 pupils is about average. 

Baseline averages for the four FT and six NFT classrooms included
in this project sample show that the two groups lack comparability on
nearly all indicators. The FT classes systematically (and significantly)
scored lower on all baseline measures. FT pupils were primarily Black,
whereas NFT pupils were primarily non-Black. FT pupils came from
substantially more disadvantaged home environments than NFT pupils (their
parents were poorer, ethnically different, less skilled, and less well
educated). Teachers of the two groups were relatively comparable, although
NFT teachers were better integrated into their pupils' communities than
were FT teachers.



Results of analysis show that NFT pupils scored significantly higher
than FT pupils on the affect measure and that NFT parents felt a
significantly greater sense of control than FT parents did. All other
results failed to reach significance; in general, FT groups scored lower
than NFT groups. First-year data for these same pupils show similar
results. Comparison of first- and second-year effects suggests that the
program is failing to produce the targeted improvements on pupil, parent,
and teacher outcome variables. However, the evidence of control group bias
is so pronounced as to suggest that the groups in our sample are drawn
from two initially distinct populations. If they are, then the probability
of inappropriate comparisons and differential regression on covariables is
increased and serious questions are raised about the validity of this
project analysis. We believe the risk of faulty interpretation resulting
from essentially invalid comparisons is sufficiently great in this case
to warrant exclusion of this project from the interim evaluation.


Project HS(a)



The evaluation data for Project HS(a), also a CI-EF sample, are
presented in Table 62 for the two-year effects, and in Table 63 for the
one-year effects. This project is located in a rural community in the
east south central portion of the United States. The project is moderate
in size (450 pupils) and primarily Black in ethnic composition. It
anticipated spending approximately $940 per pupil and had an average of one
PAC member for each 32 pupils.

The four FT and the five NFT classes were reasonably comparable on
baseline ability measures, but differed in classroom composition (NFT
classes had proportionally more males and Blacks* and more preschool
experience than FT classes). FT families differed from NFT families in
only two respects. The FT heads of household were more likely to be male
and currently working than were the NFT heads of household. It should
be noted that both FT and NFT families were severely impoverished, with
very few parents having high school educations or skilled occupations
and nearly all families meeting the OEO poverty criteria. Data on teachers
were inadequate for analysis.

Results of analyses of these second-year data display significant 
FT-favoring differences only on days absent (95 percent confidence 
interval = 3.^ to 19.2 days). But on the WRAT and reading outcomes, 
FT pupils showed significantly less gain than NFT pupils. Analysis of
first-year results for these pupils (Table 63) indicates that the FT
deficits are cumulative. For both first- and second-year measurements, the
results favor NFT, and by the second year some of the differences become
significant. However, the FT-favoring affect difference remains
significant across both analyses, so apparently FT pupils have more positive
attitudes but are learning the measured academic skills at a slower rate
than NFT pupils in this project.

That this program had substantial impact on parents is evident from 
the analysis of parent data. The FT parents significantly differ from 
the NFT parents on the parent/child interaction scale, the parent/school 
involvement scale, and the parent expectation scale. These outcomes sug- 
gest that the model's emphasis on parental participation did succeed. 



* Table 62 shows FT classrooms sampled averaging 82 percent Black and the
parent interview sample as 72 percent Black. Yet according to the sponsor,
all participants at this site (both FT and NFT) were Black. This conflict
reflects further on data reliability problems.
Factor score profiles obtained from classroom observation data suggest
that the FT classrooms were not really all that different from the
NFT classrooms. Only slight differences on most of the process components
are noted. The FT classrooms can be best characterized by the
expressiveness factor (as can NFT), and both appear low on the isolatory
components (self-learning and self-regulatory). An identifiable salient
process corresponding to this model is not evident in these data.



Summary 

The salient features of the cognitively oriented curriculum model 
are outlined as follows : 

Focus and Objectives: emphasizes long-range program objectives
    Child
        Cognitive (major emphasis)
            Develop thinking skills
            Develop competence in basic skills

Curricular Approach
    Teacher's role that of facilitator
    Reinforcement primarily from activities
    Individual child's development is continuously assessed,
        and appropriate materials are provided
    Group session (whole class situation) used to plan and
        revise daily activities; otherwise three adults work
        within open classroom framework

Type of Parent Involvement
    Home visitors work with parents to plan child's
        activities at home.

Since the serious comparison group and data collection problems noted 
or suspected for two of the three samples lead to our recommendation that 
these samples not be included as evaluative evidence regarding the model's 
effects to date, our judgments and comments must be based on analysis of
a single project. 




Clear evidence that classroom processes in FT classrooms differ from 
those in NFT classrooms fails to occur in other High/Scope projects, as
documented in the SRI FT Classroom Observation Study, 1972(b). 
Data on this project indicate that the program has not yet produced
evidence of positive impact on the development of competence in certain
academic skills greater than that displayed by the NFT pupils. In fact, on
the WRAT and reading measures, NFT pupils average significantly above FT.
It is possible that the current evaluation variables are not appropriately 
sensitive to many of the cognitive behaviors that the model purports to 
develop. Also, since process factors showed high FT/NFT similarity, it 
may be that the model was not well implemented in the FT school. Nonethe- 
less, the data show FT pupils are clearly behind NFT in the learning of 
certain basic skills as measured by the FT battery. 

The parent results do indicate the model has been successful in de- 
veloping parent/child interactions, more involvement in school, and more 
positive expectations for their children's success. 






FLORIDA PARENT EDUCATION MODEL 
University of Florida 

Sponsor's Intended Approach

As the name of this model implies, its primary focus rests on educat- 
ing parents to participate directly in the education of their children and 
motivating them to build a home environment that furthers better perfor- 
mance on the part of the child both in school and in life. Basic to the 
model is recognition of the fact that parents are a key factor in the 
emotional and intellectual growth of their children and that they are 
uniquely qualified to guide and participate in their children's education. 

The Florida model is designed to work directly in the home. It is 
not classroom oriented in the traditional sense of having a preset cur- 
riculum or prescribed teaching strategies. It is developmental in its 
approach, changing classroom organization, teaching patterns, and the 
curriculum as needed to integrate learning activity in the school with 
that in the home. Learning tasks are developed that allow the home and 
the school to work as instructional partners. Thus, responsibility for 
curriculum development resides in the community, and the curriculum is 
the product of parent and school staff cooperation.

Paraprofessionals play an especially significant role in this model,
working in the home and in the classroom. Mothers of project children 
are trained as both teacher auxiliaries and as educators of other parents 
and are assigned two to a classroom. They work half-time assisting the 
teacher and the rest of the time making home visits, demonstrating and 
teaching other mothers learning tasks developed to increase the child's 
intellectual competence and personal and social development. While in 
the home the parent educator also actively solicits ideas and information 
on which strategies are working from the parents. 

In addition to her instructional role, the parent educator acts as
liaison between the project overall and the home, serving as a referral
agent for medical, dental, psychological, or social services. She
informs the parents about Policy Advisory Committee meetings and other
school/community functions in which they should become involved. Her
experience with the children in the classroom setting as a teaching
assistant enables her to keep individual parents up to date on their child's
specific needs. This highly active role of the paraprofessional is
crucial to the operation of the Florida model.

The teacher supervises the classroom activity of the parent educator
and assists her in planning and carrying out her assignments in the home. 
Conversely, the teacher modifies her own activity on the basis of knowl- 
edge obtained from the parent educator's reports on the home. Parents 
are invited into the classroom not as passive observers but to partici- 
pate actively in the instruction. Through such persistent contact the 
teacher learns and grows along with the parent and obtains a sound basis 
from which to guide preparation of learning tasks. 

Recognizing that the role of the Policy Advisory Committee is basic to
the program, each school develops a "mini-PAC" that participates in the
activity of the larger Follow Through PAC. The larger PAC group is
involved in staff selection, budgets, working with project professionals
on development of home learning tasks, and in strengthening all
components of the program.

Both preservice and inservice training are provided by the sponsor
in implementing the model. A workshop at the University of Florida
trains a cadre of teachers and parent educators along with such other key
personnel as Follow Through representatives, principals, and PAC
chairmen. People attending this workshop, in turn, conduct workshops at the
project site. Video tapes made in the classroom and in the home guide
the sponsor in addressing problems pertinent to model implementation and
development. Projects also provide the sponsor with copies of their
home-learning tasks, weekly observation reports, and replies to attitude
questionnaires. All such information is collected subject to review and
approval by the PAC. The flow of information among the sponsor, the local
education agency, and the parent community reflects the team partnership
emphasis of the model and gives the education of individual children its
direction and shape.

Individual Project Results 

Six samples from three different projects sponsored by the University 
of Florida (UF) were included in the analysis of interim effects. The 
distribution of these evaluation samples in terms of cohort, outcome, and 
project is as follows: 
    Cohort    First-year Effects    Second-year Effects

    I-EF      (none)                project b
    I-K       project c             projects a and c
    II-EF     project b             (none)
    II-K      project a             (none)

Project UF(a) 

This project is located in a south Atlantic city of approximately
500,000 residents, nearly one fourth of whom are Black. The project
anticipated an above-average per-pupil expenditure of $1,023 and had a very
high PAC participation of one member for each 5.5 pupils. Two sets of
data are examined below: second-year outcomes for Cohort I and first-year
outcomes for Cohort II.

Cohort I data are presented in Table 64. Baseline data indicate that
children in the four FT and ten NFT classrooms included in the analysis
scored fairly comparably on tests taken as they entered school. Although
more of the FT children had attended preschool, a substantial percentage 
of NFT children (56 percent) also had preschool experience. 

All FT children were Black, and rosters indicate that 28 percent of 
the NFT pupils were non-Black. However, this information is inconsistent 
with the data on families obtained by interview. All of the parents inter- 
viewed, both FT and NFT, were Black. FT parents were more likely to be 
high school educated and slightly more likely to have a skilled occupation
and to be employed. They are also described as slightly more likely to 
meet poverty eligibility requirements, which is somewhat inconsistent with 
the general trends in these interim data. 

Teacher data show that both FT and NFT teachers were fairly well 
trained, with FT teachers having more experience than NFT. Although NFT 
teachers reported slightly higher job satisfaction, most of them had 
apparently been assigned to the school and classroom in which they taught. 
FT teachers, on the other hand, indicated a fairly high level of freedom 
to choose teaching assignments. Although NFT teachers reported more book 
resources, they did not report more classroom helpers.

Child outcomes on both overall cognitive measures (WRAT total and
overall achievement) show that NFT children improved significantly more 
than FT children. Among the cognitive skills, the only significant dif- 
ference was in language arts; again NFT pupils showed more improvement. 






Although the slightly higher FT scores on affect measures do not reach 
significance, the attendance measure favors FT significantly (95 percent 
confidence interval of -20.4 to -10.0, with negative scores indicating 
lower absence rates). 
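
The reading of the attendance interval above can be illustrated with a short sketch. It assumes, as the text implies but does not state, that a difference is treated as significant at the 95 percent level when its confidence interval excludes zero, and that for this measure negative values favor FT. The function names and the sign-convention parameters are hypothetical and are not part of the SRI analysis.

```python
def interval_excludes_zero(lower: float, upper: float) -> bool:
    """Return True when the interval lies entirely on one side of zero."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return lower > 0 or upper < 0


def favored_group(lower, upper, negative_favors="FT", positive_favors="NFT"):
    """Name the group favored by a significant difference, or None.

    The sign convention is an assumption supplied by the caller; the
    default matches the days-absent measure described in the text,
    where negative values indicate lower absence rates for FT.
    """
    if not interval_excludes_zero(lower, upper):
        return None  # interval spans zero: no significant difference
    return negative_favors if upper < 0 else positive_favors


if __name__ == "__main__":
    # Attendance interval reported for Project UF(a), Cohort I.
    print(favored_group(-20.4, -10.0))
```

Because the sign convention differs between measures in this report, the favored group is passed in explicitly rather than inferred from the sign alone.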

Classroom observations of FT and NFT classrooms in Project UF(a) with 
Cohort I pupils in their second year of school (first graders) showed that 
the two sets of classrooms were similar in many respects. For example,
both FT and NFT classrooms were low (relative to all classrooms observed) 
and similar to one another on the self-regulatory factor. They were also 
somewhat below overall averages and similar to one another on the child- 
initiated interaction and the child self-learning factors. Their greatest 
relative differences were on the programmed academic and the expressive 
factors. On both these factors, FT and NFT classrooms were below the 
overall means for all classrooms observed. However, the relative dif- 
ferences between FT and NFT groups were substantial; the NFT group was 
relatively higher on the programmed academic factor, and the FT group 
was relatively higher on the expressive factor. This difference in em- 
phasis may help account for NFT superiority on the achievement measures 
and FT superiority (or equality) on the attendance and attitude measures. 

The model had some positive impact on the FT teachers. They reported 
a significantly higher acceptance of FT and its innovations than NFT 
teachers reported of their methods. But on the parent image variable FT 
and NFT did not differ significantly. Since the University of Florida 
employs parents as paraprofessional home-school coordinators, it is
possible that teachers felt it was not essential for them to contact parents
personally outside of the classroom. However, the complexity of the 
parent image variable does not exclude alternative interpretations. 

Data for Cohort II are presented in Table 65. Pupil data are based 
on three FT classrooms and two NFT classrooms. FT and NFT children appear 
to have been comparable in ethnicity (all are Black) and education of their 
parents (64 percent without high school diplomas). However, more of the 
FT children are male (58.6 percent, compared with only 39.7 percent of the
NFT children), and a higher percentage of FT children came from homes
meeting poverty criteria and in which the head of household was not employed.
Nevertheless, FT and NFT scores on baseline tests are relatively comparable,
except on reading, where the FT children are two points behind the NFT
children.*

Teacher covariable data indicate that FT and NFT teachers were
comparable in terms of job satisfaction and community membership. Whereas
Cohort II FT teachers were more experienced than NFT teachers, both groups
had somewhat less experience than did teachers in the Cohort I sample.
Also, Cohort I FT teachers reported a fair amount of freedom to choose
the school and classroom in which they worked, while Cohort II FT teachers
were apparently assigned, and NFT teachers chose their assignments.

Outcome data for the children in Cohort II are favorable to the model. 
FT children score above NFT children (at a 95 percent level of confidence) 
on all achievement outcomes except cognitive processes, where the differ- 
ence favors FT but does not reach significance. These FT-favoring differ- 
ences represent a sharp contrast to the effects observed for Cohort I. 
Indeed, comparison of Cohort I to Cohort II results strongly suggests an
improved implementation effect.

That is, although outcome measures for parents all fail to display
significant differences, the parent data show a small but consistent
tendency to favor FT. FT teachers, on the other hand, responded essentially
no differently than NFT teachers regarding parent image and acceptability
of method. Given the parent role emphasis of this model, the Project
UF(a) pupil results seem to indicate that the effectiveness of this
involvement is improving with successive samples of parents and pupils.

Classroom observation data for Cohort II indicate that the FT classes
were below the overall average (as were the NFT classes) in the scaled
process dimensions. The FT classes can be characterized as more expressive
and less structured than NFT. This pattern is not inconsistent with the
emphasis of the model that parents should facilitate the classroom process
and assume more of a direct role as the principal educators of their
children.

* Data on the preschool experience of NFT children in this cohort present
a problem. Data from rosters indicate that NFT children did not have
preschool experience, while data collected from parents indicate that they
did. In this case, the latter set of data is more likely correct, since
rostering problems were encountered for this set of pupils. Thus, the
child covariable data are slightly biased in favor of NFT outcomes. But
since the observed differences are, in general, large and in favor of FT,
the probability that the interpretation would change because of this bias
is considered remote.

Project UF(c) 

This project is located within a large city in the middle Atlantic 
region. Slightly more than one third of the 4 million residents are non- 
white. The anticipated per pupil expenditure was $826, and the pupil/PAC 
ratio was 6.8, which suggests above-average parent involvement on the PAC. 

Baseline data for the children in the five FT and four NFT Cohort I-K
classrooms included in the analysis of second-year effects (Table 66)
consistently show FT children with higher scores, in some cases
substantially higher, than the comparison group. Although the two groups were
similar in ethnic composition (almost totally Black) and had approximately 
the same employment rate for heads of household (about 50 percent), the 
parents differ considerably in terms of percentage with high school edu- 
cation and skilled occupation. FT children scored consistently higher on 
pretests with the largest differences (8.6 points) occurring on the read- 
ing measure. 

FT classes averaged higher on all pupil outcomes than NFT classes, 
but none of these differences is significant. 

One-year data for Cohort I children are presented in Table 67. As 
was the case for the second-year data, none of the child outcome measures 
reached significance at a 95 percent level of confidence. However,
comparison of differences between the first- and second-year results shows
a trend toward increasingly positive impacts on FT children. 

None of the parent outcomes reached significance, but the high degree 
of parent participation in the PAC group is encouraging. 

It is interesting that FT and NFT teachers display similar job satis- 
faction ratings, since NFT teachers have a great many resources available 
to them and more freedom to choose assignments than FT teachers. 

The ethnic difference between NFT teachers and their students is
especially interesting in view of the teachers' evaluations of the
importance of parents to education. NFT teachers viewed parents as being an
integral part of the educational system outside of school time more
frequently than did FT teachers.
The process profiles obtained through classroom observation for FT
and NFT pupils in Cohort I in Project UF(c) were fairly similar. Both showed
low scores (relative to the overall average) on the child-initiated
interaction and the self-regulatory factors, and both were moderately low on
the expressive factor. FT and NFT classes were most different on the
child self-learning and the programmed academic factors. On the child
self-learning factor, FT classes were very much higher than NFT classes
and substantially above the mean for all classrooms. On the programmed
academic factor, FT classes were lower than NFT classes, although neither
deviated radically from the overall average.



These pattern differences in process may have been related to the
favorable differences on achievement, affect, and attendance outcome
measures, although the magnitude of these outcome differences is not
statistically significant.



Project UF(b) 

This project, located in the west south central region, is far from 
the nearest SMSA, which is of moderate size (664,000). The community's 
population is only 6 percent Black. The proportion of Black children in
the two FT samples was about 15 percent. This project anticipated a very
low per-pupil expenditure of $516. Data on this entering first grade
project are available for second-year effects on Cohort I and first-year 
effects on Cohort II. 



Data on Cohort I are summarized in Table 68. Baseline data show
that the six FT and three NFT classrooms were moderately comparable on
the prescore measures, although FT children tended to average below NFT
children. The families are also moderately comparable, although again
FT families appear more disadvantaged than NFT families on all demographic
indicators (education, occupation, income, etc.). The high employment
rates displayed by both samples suggest farm-worker families in the rural
south.

Analyses of the pupil outcome variables failed to reveal any
significant two-year FT impacts for this project. Analysis of parent outcomes,
however, showed a significant FT-favoring difference on the parent/school
involvement variable. This outcome is compatible with the model's major
emphasis on parental involvement in the education of their children.

Teacher data for this sample show FT teachers as somewhat more
satisfied with their working conditions and more likely to have classroom
helpers than NFT teachers. On the other hand, NFT teachers reported more book
resources, were more closely tied to the neighborhoods of their pupils,
were freer to choose assignments, and appeared far more experienced than
FT teachers. Analysis of teacher outcomes in terms of these differences
failed to reveal significant FT/NFT differences. However, the magnitude
and nature of this overall pattern of differences certainly suggested
substantial lack of comparability between these two samples.

Covariable data on families for Cohort II children (see Table 69) 
display the same tendencies as the data for Cohort I, although the FT/NFT 
differences are more pronounced for Cohort II in every case. FT families 
in Cohort II were less well educated, less well employed, and more often 
classified as poverty eligible than both their NFT comparison and their 
Cohort I FT counterparts. FT and NFT pupils had had similar preschool
experience. The NFT group scored slightly higher on all baseline test 
measures except reading. The two groups also lacked comparability in 
classroom composition, the FT class containing slightly more girls than 
boys, while the NFT group was almost all male. 

Like the Cohort II data for Project UF(a), child outcome data
significantly and consistently favor FT over NFT. These gains are indeed
impressive, since they are reflected on overall cognitive outcomes
(achievement and WRAT), specific skills (reading, language, and quantitative
skills), and affect measures.

Covariable teacher data indicate that the FT teachers in Cohort II
classrooms were considerably more experienced than those in Cohort I
classrooms; they were also more experienced than the NFT teachers with whom
they were compared. They identified somewhat more strongly with the
community, and reported slightly higher book resources and more classroom
helpers than NFT teachers. Perhaps partly because of the presence of
additional resources and helpers, FT teachers indicated greater job
satisfaction. Although the adjusted outcomes still favor the FT group, neither
is significant.

Data on parent impacts were not available for this cohort.

Summary



The salient features of the University of Florida approach can be
summarized as follows: 

Focus and Objectives: emphasizes long-range objectives
    Parent
        To educate parents for direct participation in the education
            of their children
        To motivate parents to build a home environment conducive
            to learning
    Child
        Increase intellectual competence
        Promote personal and social development

Curricular Approach
    Parent educators divide time between assisting teacher in
        classroom and making home visits to demonstrate learning
        tasks to parents
    Parents also participate actively in the classroom quite
        often

Type of Parent Involvement
    Direct both at home and at school.

The primary concern of this model is to increase the amount of
parental involvement in the educational process. The goal is to accomplish
this by educating parents for direct participation in the education of
their children and by motivating parents to build a home environment
conducive to learning.

In summarizing the results of the interim analyses, it appears that
the Florida approach has met with mixed one- and two-year success on
Cohort I project samples. The sponsor level analysis (Table 70) on
Cohort I-K projects fails to reveal any significant overall parent or
pupil impacts. This absence of results, particularly for parent/school
involvement, is possibly due to implementation problems associated with
these samples or, equally likely, to data problems. That is, as has
been noted for other approaches, Cohort I data often yield conflicting
results.
The analyses of teacher and parent outcome variables generally show
greater acceptance of Follow Through by FT teachers and greater parental
involvement for FT parents. This pattern is interpreted as positive
evidence of attainment of some basic objectives of this model. The overall
finding (see Table 70) that Cohort I FT teachers held a significantly less
positive parent educator image may, in part, reflect implementation
difficulties. An alternative explanation may lie within the mechanics of the
Florida model itself, making this finding wholly acceptable, if not
anticipated. Specifically, in this model contacts with the parents are
generally initiated by the classroom paraprofessionals (parent educators)
rather than by the teachers. The finding that NFT teachers consider
contacts with parents outside the classroom more essential than do FT teachers
may, then, be explained by the Florida model's delegation of this
responsibility to the parent educator.

In general, the Cohort II samples produce much stronger and more 
promising evidence of the efficacy of the Florida approach. Since 
Cohort II samples contain far fewer data problems, and since improved 
outcomes may reflect improved implementation of the model, we feel that 
these latter data should be stressed. 






EDC OPEN EDUCATION PROGRAM 
Educational Development Center 



Sponsor's Intended Approach 

The EDC Follow Through approach is a program for helping communities
generate the resources to implement open education. It is not
specifically a program in compensatory education because it is based on
principles EDC considers relevant for the education of all children. The
approach is derived in part from ideas and practices evolved over many
years in British infant and primary schools. It also draws heavily on
knowledge of child development gained during the last 50 years and on
EDC experience in curriculum and school reform. EDC believes that
learning is facilitated by a child's active participation in the learning
process, that it takes place best in a setting where there is a range
of materials and problems to investigate, and that children learn in
many different ways and thus should be provided with many different
opportunities and experiences. In other words, the ability to learn
depends in part on the chances to learn provided by the educational setting.

The classrooms are "open," and the children usually choose their 
activities, drawing on a great variety of materials in the room. The 
room is often divided into several interest areas for activities in making
things, science, social studies, reading, math, art, and music. Small 
groups of children use any or all of these interest areas during the day. 
In addition, traditional subjects may be combined with any one interest 
area. Whether or not interest areas are physically set out, the open 
classroom is characterized by an interaction of subject matter and by 
purposeful mobility and choice of activities on the part of the children. 

The child's experience is one of the starting points for teaching
in an open classroom; the teacher's input is another. The role of the 
teacher is an active one. Teachers lead children to extend their own 
projects, through thoughtful responses and suggestions. The classroom 
is carefully supplied with materials that are likely to deepen children's 
involvement. The teacher occasionally works with the entire class but 
more often with a small group or an individual child. Aides and other 
adults also participate in teaching roles. 
Traditional academic skills are important in the open classroom and
children have many opportunities to develop them in flexible, self-directed
ways that allow learning to become a part of their life style outside as
well as in the classroom. EDC believes that if children are going to
live fully in the modern world, the schools must embrace objectives that 
go far beyond literacy training, the dissemination of information, and 
the acquisition of concepts. This approach is concerned with children's 
growth in problem-solving skills, their ability to express themselves 
both creatively and functionally, their social and emotional development, 
and their ability to take responsibility for their own learning. Accumu- 
lated experience in early childhood education in this country and overseas 
suggests that these larger aims must be taken seriously from the very 
outset of formal schooling, and that the environment that provides for 
them also provides a sure foundation for academic learning. 

An EDC advisory team makes monthly visits to the community to assist 
the schools in making the changes needed to develop open education. EDC 
policy is to work in places with individuals who are ready for change,
who have a sense of the directions in which they want to move, and who 
need and request advisory help. 

The advisory team does not attempt to impose specific ideas or methods 
but tries to extend what individuals are capable of doing. The team helps 
by suggesting appropriate next steps and provides continuing support to 
teachers and aides. It conducts workshops for teachers, aides, parents, 
and administrators; works with teachers and aides in the classroom; pro-
vides appropriate books and materials; helps teachers and aides develop
their own instructional equipment; and assists school administrators with 
problems related to classroom change. 

EDC is convinced of the important role parents can play in the edu-
cation of their children. Parents have a right and a responsibility to
be involved in all decisions affecting their children. In addition, the 
teacher's effectiveness is greatly increased by his knowledge of a child's 
life outside of school. The EDC advisory team helps teachers, aides, and 
administrators work with parents to make them better informed about the 
open education program, to use parents as an important resource for 
knowledge about the children, and to involve parents in decisions con- 
cerning the education of their children. 




Individual Project Results 



Seven samples from three different projects sponsored by the Educa- 
tional Development Center (ED) were included in the analysis of interim 
effects. The distribution of these evaluation samples in terms of cohort, 
outcome and project is as follows: 



Cohort    First-year Effects    Second-year Effects

I-K       (projects b & c)      (projects b & c)
I-EF      (project a)           (project a)
II-K      (project b)



Project ED(c)

Data describing two-year impacts on Cohort I-K are presented in 
Table 71. This rather small project (300 pupils) is located within a
major urban area in the south Atlantic region. The FT children for whom 
data were included were all Black; in contrast, only 72 percent of the NFT 
children were Black. 

There is evidence that the FT and NFT samples were not well matched 
on several variables. For example, the NFT sample reported a higher 
percentage of skilled employment and a substantially higher level of head 
of household employment than did the FT sample. As might be expected, 
the FT sample more frequently met poverty level guidelines than did the
NFT sample. 

The lack of comparability of the two groups is also reflected in the 
child baseline variables. More of the NFT children had had preschool 
experience, and the NFT group almost uniformly outperformed FT children 
on baseline test measures. The FT deficiency was especially severe in 
reading. 

Because of data problems with the NFT sample, analysis of parent 
impacts was not possible for this project. The teacher data show that
FT teachers were less satisfied with their jobs, had fewer book resources, 
and had less teaching experience than NFT teachers. FT teachers reported 
more freedom to choose their assignments than did NFT teachers. The two 
groups were virtually identical on the closeness to the community and the 
number of helpers in the classroom variables.

Pupil outcomes show FT-favoring trends on all test variables, with
the difference reaching significance on the quantitative skills measure. 
The teacher outcomes favor FT on both the acceptance of methods and image 
of parents variables, with the latter effect reaching significance. Since 
the FT-favoring difference on acceptance of method approached significance, 
we interpret these results as consistent with the goals of the model. 
Since this project both incorporated the EDC model and operated as a 
parent-implemented, self-sponsored project, one would expect a high degree 
of autonomy and teacher assistance to prevail. 

The overall picture presented by the two-year data for Cohort I shows
modest, but encouraging, impacts on the children. The positive impacts on
teachers may help to sustain and improve impacts on children in subsequent 
cohort groups. The lack of parent and classroom observation data for this 
project is unfortunate. It would be interesting to relate these measures 
to the teacher and pupil outcome variables. 

First-year data are also available for Cohort I-K, kindergarten children 
in this project (see Table 72). This group of children was slightly dif- 
ferent from the second-year group, both racially and in amount of pre- 
school experience. The majority of the FT children in the one-year group 
were non-Black, while the NFT group had a slight Black majority. The FT 
group also had a higher incidence of preschool experience than their NFT 
comparisons. 

The two groups distinctly lack comparability on demographic and 
baseline test data. On parental employment and the presence of a male 
head of household, the FT groups averaged well below the NFT group. How- 
ever, the samples were virtually identical on poverty eligibility. Also, 
FT children averaged well below NFT children on all baseline tests. 

The adjusted outcome measures for the first-year data reveal no
significant differences between the FT and NFT samples. A comparison 
of adjusted differences for the one- and two-year child outcome data 
suggests improvements during the second year in every outcome but 
attendance. While only one measure reaches significance after the second
year, two-year outcomes consistently display FT-favoring trends. These
trends should, however, be interpreted with caution. The large baseline 
differences that existed between FT and NFT samples restrict considerably 
the confidence with which we can interpret these data.

Project ED(b)

With more than 1,100 pupils, this project is one of the largest 
included in the interim evaluation. The project is located in a large 
middle Atlantic city (more than 4 million residents) with a substantial 
minority population (about one third of the inhabitants are nonwhite). 
Children in this project enter school in kindergarten. Data are available 
on both Cohort I-K (first- and second-year effects) and Cohort II-K (first- 
year effects). The Cohort I, two-year data, presented in Table 73, will 
be considered first. 

Covariable data indicate that children in the nine FT and five NFT
classrooms that were included in analysis of child outcomes were well
matched, both on background data and on tests taken shortly after they
entered kindergarten. The two groups display little difference in terms
of age, race (almost all children were Black), or percentages of boys and
girls in the samples. Differences in demographic variables were also 
relatively small. About half of the parents in both groups lacked high 
school diplomas, and employment of head of house was indicated for be- 
tween two thirds and three fourths of the cases, with slightly more than 
25 percent of both samples employed in skilled jobs. Male heads of house-
hold were present in about 65 percent of both samples, and the poverty
level for FT was slightly higher than that for NFT (61 percent, compared to
55 percent). A difference in the amount of preschool experience was
evident, however; 54 percent of the FT sample reported preschool ex- 
periences, while only 17 percent of the NFT sample did. The FT and NFT 
scores on pupil baseline tests were very similar. 

The baseline data on the teacher variables suggest that the two 
teacher groups were fairly comparable. NFT teachers were more likely to
resemble their students ethnically and were closer to the community than
FT teachers. They also reported slightly more training and experience.
FT teachers, however, had greater freedom to choose teaching assignments
and had more aides in the classroom, whereas NFT teachers had somewhat 
more book resources available. Strong satisfaction with working conditions 
was not noted for either group. 

Because of the reasonably good FT/NFT match, adjustments to child 
outcome measures had only minor impacts on FT/NFT unadjusted differences. 
An inspection of Table 73 reveals that none of the pupil outcome variables 
differed reliably from chance expectation. A similar conclusion is 
apparent in the parent and teacher outcome variables. Neither of these 
analyses indicated significant differences between the FT and NFT samples.

Classroom observation data, collected during the second year, show
FT classrooms scoring very high on the self-regulatory factor. This is
consistent with the model. The average or low scores on the other
factors make plausible the lack of striking findings on child outcomes, 
although no data on NFT classroom processes are available for comparison. 

First-year effects for Cohort I-K are summarized in Table 74. The
outcomes for this analysis reveal a single significant difference: NFT
students had a significantly lower level of absenteeism than did FT 
students. 

Child covariable and outcome data for Cohort II are included in 
Table 75. The FT and NFT samples for Cohort II-K are slightly less com- 
parable than the samples for Cohort I-K. Although the proportion of 
parents reporting high school educations is somewhat higher for FT than 
for NFT, more FT families were poverty eligible (38 percent FT, 28 percent 
NFT). The higher poverty level in the FT group is consistent with the 
greater absence of male heads of household and the higher unemployment 
and unskilled employment noted in this sample. FT children also had, on 
the average, slightly (half a month) less preschool than their NFT counter- 
parts. These differences in the demographic variables were not reflected 
in pupil baseline measures; the FT children performed about as well on 
baseline tests as did the NFT children. In fact, their scores were
slightly higher on most baseline measures. On the language variable, 
however, the average FT score was 2.3 points below the average NFT score. 

All of the adjusted FT/NFT outcome measures favored NFT except at-
tendance, which shows less absenteeism for FT. The only measure showing
significant differences was quantitative skills. 

The classroom observation data, available for FT classes only, show 
a pattern similar to the one for Cohort I-K, although even more pronounced.
That is, consistent with the model, the self-regulatory factor score is
very high and the scores on the other factors, particularly child-initiated
interactions and self-learning (both connoting a classroom in which adults
do not initiate contacts with children), are low.

Project ED(a) 

This project is located 50 to 70 miles from a south Atlantic SMSA 
with a population of half a million. The population is about 20 percent
nonwhite. Included in the interim evaluation are one- and two-year 
results for children in Cohort I-EF. First grade is the entering year in 
this rural community.

Covariable data describing the children in Cohort I-EF, presented in
Table 76, indicate a fairly severe mismatch between the FT and NFT groups.
On demographic variables, FT children appear less disadvantaged than 
project children at some other sites, but much more disadvantaged than 
the NFT children who made up the comparison group. The typical family 
in both groups included an employed male head of house. However, about 
half the FT parents lacked high school diplomas, while the majority 
(about 80 percent) of the NFT parents had completed high school. Level 
of employment was high for both groups, but more skilled occupations were 
associated with NFT families (75 percent versus 55 percent). FT children
were more often Black and more frequently came from poverty-eligible homes. 
The majority of both samples reported preschool experience, with a slightly 
higher experience rate in the FT sample. On baseline test measures, NFT 
children performed consistently better than FT children, with the most
striking differences in the areas of reading, language, and affect. 

The baseline data for the six FT and six NFT teachers show that FT 
teachers had less experience, less freedom to choose their teaching 
assignment, and, surprisingly, fewer classroom helpers than NFT teachers. 
FT teachers did report more book resources and slightly more satisfaction 
with working conditions.

There were no significant differences between FT and NFT children on 
any of the pupil outcome variables. The slight differences consistently 
favored NFT on all of the test variables, but attendance favored FT.

The parent outcome analysis yielded one significant difference. FT
parents reported significantly more involvement in school than NFT parents.
No other differences on parent or teacher outcomes reached significance. 

The significant parent/school involvement outcome is consistent with
the high degree of parental involvement suggested by the large PAC for 
this project during the 1970-71 school year. 

One-year data on these Cohort I-EF children are presented in Table 
77. The adjusted pupil outcome measures generally reveal minimal differ-
ences between FT and NFT pupils. The one exception is the significant
advantage noted for the FT children on the cognitive processes measure. 

In summary, it appears that this program has had minimal impact on 
pupil and teacher outcomes in this project. The lack of classroom
observation data at this site makes interpretation difficult. Data 
collected from parents, on the other hand, are encouraging. 






Summary 



Separate summary analyses on the EDC Open Education Program are 
reported only for Cohort I-K, two-year groups, which include two project
samples. Sponsor summaries based on a single project are not repeated
in this section. 

The salient features of this FT approach are summarized below: 

Focus and Objectives: long-range program objectives

   Child

      Cognitive

         Develop competence in basic skills
         Promote problem-solving skills

      Affective

         Develop ability in self-expression
         Develop self-direction

Curricular Approach

   Teacher's role that of facilitator
   Reinforcement primarily from activities
   Child generally free to choose among wide variety of activities
   Individual/small group focus

Type of Parent Involvement

   Inform parents about program
   Teachers use parents as resource in planning child's education
   Some form of decision making

The results of the outcome analysis are presented in Table 78. An
inspection of the table indicates that the program produced no significant
FT-favoring results on the child outcome measures. The FT group performed
slightly better on four individual variables; the NFT group, on three
variables. Results of parent outcome analyses show NFT parents reported
interacting with their children to a significantly greater extent than 
did FT parents. No other parent outcome differences reached significance. 

The analysis of teacher outcome variables at the sponsor level suggests 
that there were no reliable differences between FT and NFT teachers. How- 
ever, classroom observation process data indicate that the model is being 
implemented according to its specified goals.

Interpretation and evaluation of this general lack of impact evidence
for this sponsor, and for the individual projects, must include considera- 
tion of the occasionally severe lack of comparability of these FT and NFT 
samples, which, we suspect, seriously confounds the analysis of effects. 
Hence, it is possible that many results were undetected or are grossly 
underestimated on the basis of these data. 




INTERDEPENDENT LEARNING MODEL 
New York University 



Sponsor's Intended Approach

The Interdependent Learning Model (ILM) is a transactional approach 
to education that focuses on the learner as an individual and on the so-
cial interactional context within which learning occurs. It contains
elements of both the open classroom and individualized program approaches, 
but is distinguished by its strong focus on small group interaction as 
the basic structure out of which learning emerges. This derives from the 
conviction that a child gains most of his knowledge from interaction within 
his family and with his peers rather than while sitting at a desk. If
education is truly preparation for life, the theory goes, it needs to be 
more life-like in its structure. 

ILM, for example, advocates an emergent approach to language develop-
ment in which communication rather than language per se is stressed. A
child develops language proficiency by being presented with situations of 
increasing complexity that motivate him to express himself verbally. Lan- 
guage emerges from situations rather than being prescribed. Games and 
game-like activities play a major role in bringing this about. 

Games are a central feature of the ILM model, often being used in 
combination with certain aspects of programmed instruction to achieve
instructional and social objectives. Since the focus is on "learning to 
learn," curriculum content is not specific, although suggested games deal- 
ing with specific content areas, such as language, are being developed. 
In introducing new games the teacher typically follows a strategy of teach- 
ing from within; she demonstrates how to play by actually playing the game 
with a group, verbalizing what is being done and why and serving as a model 
rather than actually teaching; ultimately she transfers much of the control 
to the game rules, encouraging the children to direct their own learning. 

The advantages seen in games further define the philosophy of this
approach. They can be played by individuals with different levels of 
competence, with the more advanced helping the others. They provide feed- 
back to the child both by way of the game materials themselves and from 
the other participants; the child monitors the "correctness" of his own 
response as well as that of others. Games can approximate events in "real
life" minus the risk factor. Starting with the benefit of game rules,
groups can be quickly formed and sustained with minimal adult direction.
Thus, children can be led to assume increasing responsibility for making 
choices and managing their own behavior. 

The small group approach is considered just as appropriate for de-
veloping the teaching role as the learning role in this model. The adults
in the classroom are considered to be a team participating equally in
decision-making and teaching functions. They are expected to meet with 
other teams to pool ideas, share materials, and provide mutual support. 
The team implements the model gradually, introducing changes in the class- 
room only as the team becomes relatively comfortable with them. 

Joint participation between sponsor and the local project governs 
model implementation overall. The sponsor helps the local site develop
its program according to its own needs and objectives through a coordinator 
serving as chief liaison between the site and the sponsor's staff. In 
training sessions, local staff work as apprentices to sponsor consultants 
at the beginning of workshops and take over training sessions by the end 
of the training period. As part of the training, local staff also design 
preservice workshops for their own sites. Responsibility for training 
and implementation is steadily delegated to local staff until the model 
finally functions autonomously. 

ILM considers parents an integral part of the educational teams and 
urges schools to invite them into the classroom to play a real role in 
the educational process and to participate in model improvement. The
game approach allows parents to play leadership roles in the classroom,
even though their own formal education may be limited. Parents unable
to participate directly in the classroom are encouraged through workshops
and home visits to learn the instructional games their children are play-
ing and to play the games with them at home.

Individual Project Results



Three samples from two different projects sponsored by New York 
University (NY) were included in the analysis of interim effects. The 
distribution of these evaluation samples in terms of cohort, outcome, and 
project is as follows: 

Cohort    First-year Effects    Second-year Effects

I-K                             (projects a & b)
II-K      (project a)

Project NY(a) 

Project NY(a) is a very large kindergarten entrance project (1,109 
pupils) implemented in a racially mixed community (52 percent nonwhite) 
in a large south Atlantic urban area (SMSA = 1,173,000). A Cohort I and 
Cohort II sample are included in this project. Second-year data for the 
Cohort I sample are presented in Table 79. Baseline averages for the ten 
FT and twelve NFT classes are quite comparable. In fact, of the many 
projects included in this interim evaluation analysis, Project NY(a) has
one of the best matches of FT and NFT groups. There is one moderately 
serious problem with these project data. Pupil baseline averages show both 
FT and NFT groups as averaging 0 percent on preschool experience, an in-
accurate statistic caused by incomplete roster data obtained in Fall 1969.
Parent reports show that 56.8 percent of the FT children and none
of the NFT children had preschool experience. Although these data are 
also incorrect, it seems more likely that they represent true differences 
in preschool experience. These two errors, however, result in inappropriate 
covariance adjustments in the outcome data. In the case of pupil scores, 
the likely effect is an under-adjustment. In the case of the parent out-
comes, over-adjusting likely resulted. Problems such as these seriously
complicate the task of interpreting analysis results. 

Nevertheless, significant FT/NFT differences can be noted on the
quantitative measure and on attendance. Both of these results are FT-
favoring and suggest that FT pupils may be more interested in school and
learning more than comparable NFT pupils.

Parent measures fail to display significant FT/NFT differences.
However, FT teachers reported significantly greater approval of their 
methods than did NFT teachers. This result is mildly surprising, since 
these same FT teachers appeared less pleased with their general working 
conditions and resources and reported fewer helpers than the NFT teachers. 
Perhaps they perceive the appropriateness of their procedures, but feel
that circumstances still could be better. 

On classroom observation data, FT classes differ from NFT classes 
on the self-regulatory and child self-learning factors. Since both factors 
include independent activity by the children, these scores are in accord 
with the sponsor's advocacy of game materials in the curriculum and the
model's focus on the learner and his interactions with peers. The class-
room averages are not dramatically different from zero on any of the
factors, however, and do not by themselves assist our understanding of 
the FT child outcome data. 

Data for the Cohort II, one-year sample from Project NY(a) are pre-
sented in Table 80. FT and NFT groups in this sample, consisting of eight
FT and two NFT classes, were moderately comparable on baseline measures. 
Some discrepancies appear on parent educational and occupational levels, 
with FT being higher than NFT. Also, it should be noted that not quite 
as many FT families had male heads of household, and many of the mothers 
reported they were currently working. 

In this case, the adjustments for lack of comparability between the 
FT and NFT groups on family-social variables reduced the apparent size 
of FT-favoring differences on pupil outcomes. This adjustment differs
from the more common one, which increases FT-favoring differences, because, 
in this project, unlike most, FT parents were less disadvantaged than NFT 
parents. We present these comments only to allay concern that "true" 
differences are being obscured by the analysis. We do not believe that 
they are; there just do not appear to be any significant results — either 
pupil or parent — for the one-year, Cohort II outcomes in this project. 
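
The direction of the adjustment described above can be illustrated with a minimal sketch. It assumes a single covariate and a simple linear covariance adjustment, in which the raw FT/NFT difference is reduced by the within-group slope times the covariate gap; the report's actual procedure used several covariables and is not reproduced here. The function name and all numbers are hypothetical and chosen only for illustration.

```python
def adjusted_difference(ft_mean, nft_mean, ft_cov_mean, nft_cov_mean, slope):
    """Covariance-adjusted FT minus NFT outcome difference (one covariate).

    Assumption: a single covariate with a common within-group slope.
    The raw difference is corrected for the part that the covariate gap
    between the two groups would be expected to produce.
    """
    raw = ft_mean - nft_mean
    return raw - slope * (ft_cov_mean - nft_cov_mean)


# Hypothetical figures: both cases share a raw FT-favoring difference of 2.0
# and a slope of 0.5 outcome points per covariate point.
raw_difference = 52.0 - 50.0

# FT families less disadvantaged (higher covariate mean): the adjustment
# shrinks the FT-favoring difference, as in this project.
ft_advantaged = adjusted_difference(52.0, 50.0, 12.0, 10.0, 0.5)

# FT families more disadvantaged (lower covariate mean): the adjustment
# enlarges the FT-favoring difference, the more common case.
ft_disadvantaged = adjusted_difference(52.0, 50.0, 8.0, 10.0, 0.5)

print(raw_difference, ft_advantaged, ft_disadvantaged)
```

With these illustrative figures the same raw difference becomes smaller when the FT group starts ahead on the covariate and larger when it starts behind, which is the contrast the paragraph above draws.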

The classroom observation factor scores do not give us a consistent
picture of implementation, nor do they illuminate the lack of differences
between child outcome scores for the two groups. While the FT score on the
programmed academic factor was higher than the extremely low NFT score, FT
classes were lower on that factor than FT classes in most other projects.
Consistent with the model, FT classes differed considerably from compari-
son classes on the self-regulatory and child self-learning factors, but
on the latter the average FT score was below the general mean and well 
below the mean of Cohort I classrooms in the same project. 






Project NY(b) 



Project NY(b) is a "big city" project located in the huge mid-
Atlantic population center. The project is moderately small with near
average PAC size and anticipated per-pupil expenditure. Nearly all
pupils for whom data were analyzed were Black.

The evaluation sample from NY(b) consists of second-year data for 
Cohort I pupils. These data are presented in Table 81. The FT and NFT 
groups are moderately comparable. Again, FT pupils are superior on 
baseline tests and their families were less disadvantaged than the NFT 
families.

Pupil measures show that NFT pupils had significantly better attendance 
than FT pupils. No other pupil differences reach significance. In addition, 
none of the parent measures reveal significant FT effects. This result 
suggests that the parents of this project sample may not have reached the
level of involvement and participation emphasized by the model at the
time data were gathered. FT teachers, however, displayed evidence of
significantly greater approval and acceptance of their methods (presumably
the model's approach) than the NFT teachers, and since it seems likely 
that the long-range success of this kind of program is highly dependent 
on positive teacher regard and compliance, we interpret this teacher 
outcome as encouraging for the model. 

Summary 

The salient features of the New York University approach to Follow 
Through can be outlined as follows: 

Focus and Objectives: emphasizes long-term program objectives

   Child

      Cognitive

         Develop problem-solving ability
         Develop language competence

      Affective

         Develop self-direction
         Develop cooperative behavior






Curricular Approach

   Teacher's role that of facilitator
   Reinforcement primarily from activities
   Combines programmed instruction elements with central use of games
   Teacher demonstrates games, then gradually withdraws, encouraging
      children to direct their own learning
   Stresses small group interaction of children

Type of Parent Involvement

   Encourages parents to participate in the classroom
   Provides home visits and workshops so that parents can learn
      instructional games and play them at home with children

Since Cohort II evidence is based on a single project, summary analysis 
data are reported only for the Cohort I samples for this sponsor. These 
results, as presented in Table 82, show significant differences on
quantitative skills, parent/school involvement, parent sense of control, and
teacher acceptance of the method. The pupil achievement outcome is con- 
sistent with the model's emphasis on the development of problem-solving 
ability, but there is no evidence indicating attainment of the language 
objectives. 

We interpret these Cohort I and Cohort II results for this project
as noncontradictory and, perhaps, even compatible. It is wholly possible
that the bases for academic and social growth are being developed during
the first year or two, and that, consistent with the goals of the model,
large performance differences would be expected to accrue only in advanced 
primary grades. 

The parent result is consistent with the model's emphasis on parent 
participation in the classroom. The teacher approval outcome does suggest 
that the model is viable and, presumably, fairly well implemented in the
projects studied. Overall, these results are favorable, and we interpret 
them as positive evidence that the approach is meeting many of its ob- 
jectives. The difficulty with this interpretation is that it is based on 
evidence from only two projects in the Cohort I-K sample and, hence, cannot 
be considered conclusive at this point in the evaluation. 

LANGUAGE DEVELOPMENT (BILINGUAL) APPROACH
Southwest Educational Development Laboratory 

Sponsor's Intended Approach 

The Southwest Educational Development Laboratory model is a bilingual 
approach first developed for classrooms in which 75 percent of the pupils 
are Spanish-speaking, but it can be adapted by local school staffs for 
other population mixes. In all cases the model emphasizes language as 
the main tool for dealing with environment, expressing feelings, and ac- 
quiring skills, including nonlinguistic skills. Pride in cultural back- 
ground, facility and literacy in both the native language and English, 
and a high frequency of "success" experiences are all central objectives.

The theory applied by the model is that learning in a second language 
is easier and more effective if the child first learns concepts in his 
native language. Step-by-step sequential procedures are followed in
teaching language patterns, and both teaching techniques and materials are
designed to develop a hierarchy of thinking processes, specific terminology,
and symbols. Drills, games, and exercises are used to overcome individual 
linguistic problems. 

Focusing on content in teaching language, all classroom activities 
reinforce language development. The Kindergarten program concentrates on 
the following skill areas: visual, auditory, motor, thinking and reasoning, 
discovering and exploring, and English language structures. Oral com- 
munication precedes reading and writing in the First and Second Grades. 
The responsibility for instruction is on the teacher rather than on spec- 
ified texts. The Third Grade component of the model serves as a transi- 
tion, guiding the teacher to adapt standard curricula to the unique needs 
of the bilingual children, thus preparing them to function effectively in 
a traditional Fourth Grade. 

The model stresses a high degree of adult-child contact. Teachers 
and aides are constant language models, assuring the child he can succeed 
and reinforcing him with recognition and praise. Kindergarten classes 
are usually divided into three or four groups, with the teacher and aide 
working with one group while the other groups work independently. All 
groups cover the same material, but those progressing more rapidly are 
given expanded materials. In the First and Second Grade classes, the 
teacher presents a lesson to the whole group with visual aids and books,
and then the children work in small groups or as individuals with enrich- 
ment materials based on the lesson. 

Optimal staffing includes a bilingual teacher skilled in the method- 
ology of second-language teaching and a bilingual aide in each classroom. 
Staff development coordination and evaluation activities are also required 
of local project staff. Staff development aimed at continuous professional
development of district teachers and administrators is a supporting
component of the model. Summer training workshops for local Staff
Development Coordinators result in ongoing training and assistance at the
project site. The Southwest Educational Development Laboratory has designed
a series of training modules that include manuals, video tapes, and
filmstrips to help teachers implement curriculum materials in a way
consistent with the cultural and linguistic needs of the child.

The model seeks to accelerate the child's success at school by
encouraging a positive expectation of achievement in the parent, and parents
are invited to take part in classroom activities. Parent involvement is
regarded as essential, and special materials are available for the parent
to use at home to reinforce the child's Kindergarten experience.

During the past three years, the model has been modified and improved 
on the basis of pupil progress reports, teacher feedback, and other forma- 
tive evaluation data. 

Individual Project Results 

Only one project sample for Southwest Educational Development Labora- 
tory (SW) was included in the interim analysis data base. Two-year effects 
data for this Cohort I project are summarized in Table 83. This moder- 
ately large project (854 pupils) is in a large mid-Atlantic urban region 
(SMSA = 4,021,000). The anticipated per-pupil FT expenditure for this
project of $752 is slightly below the overall average, and the pupil/PAC 
ratio of about 18 to 1 is near average. 

Comparison of the eight FT and four NFT classes included in this pro- 
ject analysis shows that the groups lack comparability on baseline scores, 
ethnic composition, preschool experience for the classrooms, and most 
parent-level variables. FT pupils averaged substantially below NFT pupils 
on cognitive process, reading, and language measures. NFT classrooms had 
higher proportions of Black pupils and lower proportions of preschool
experienced pupils. The general pattern of greater disadvantage for FT
families also prevailed in this project sample. Indeed, the particular
values for these parent variables (low income, education, and occupational
levels, and high proportions of unemployed and female heads of household)
indicate that the sample included a large number of broken or
father-absent homes. Nevertheless, these FT families rate their child's progress
substantially more favorably than do NFT families.

Comparison of means for teacher variables shows the FT teachers
reported fewer book resources but more helpers, and they apparently had
greater freedom to choose assignments than did NFT teachers. The groups 
were moderately well matched on the training and experience scale. 

Results of analysis of pupil outcomes show significant FT-favoring 
differences on overall achievement and the WRAT total score. The .95
confidence intervals were 7.4 to 42.4 and 5.4 to 24.2 points, respectively.
The specific academic areas where these differences appear concentrated 
are quantitative and reading skills. Neither parent nor teacher program 
effects reached significance, and since this project was not included in 
the classroom observation sample, process data are not available to aid 
in interpreting these results. 

Summary 

The salient features of the Southwest Educational Development
Laboratory (SW) model are summarized as follows:

Focus and Objectives — emphasizes intermediate program objectives 

Child 

Cognitive 

Develop bilingual competence 

Affective 

Increase self-expression 
Develop positive expectation of success

Curricular Approach 

Teacher's role that of facilitator 

Teachers offer reinforcement and do so frequently with recog- 
nition and praise 
Small group focus 
Programmed materials used 

Emphasis on high degree of adult-child contact 

Type of Parent Involvement 

Urge parents to have expectation of success (achievement) 
for child 

Parents participate in classroom activities 
Provide special materials for parent use

These objectives and methods characterize the approach as
nontraditional, yet, interestingly enough, the significant results in favor of
the model's effectiveness in the single project evaluated are in
traditional academic areas of reading and mathematics achievement. There is
no independent evidence that the model was implemented as planned, but
the model's emphasis on the use of programmed materials, teacher
reinforcement, small group instruction and high adult-child interactions
appears effective in promoting positive academic growth with these
inner-city poor children. On the other hand, these results show that the model's
emphasis on parental enthusiasm and involvement appears not to have met
with success in this project. Moreover, noncognitive objectives
involving development of attitudes and aspirations within the children are not
evident in these data. One could further argue that teachers do not
exhibit strong preferences for this approach. They also do not reflect
a particularly positive image of the parents' role in extramural
educational activities, as evidenced by the responses to teacher questionnaire
variables. This last finding may, in part, be due to the relatively low 
number of helpers and resources in this project. 

Since all the above inferences are based on data from a single sample 
within a single project, we feel no conclusions can be justified at this 
time. At best, the model as implemented in this sample project seems to 
be producing positive achievement gains for pupils. 

SELF-SPONSORED AND PARENT-IMPLEMENTED PROJECTS



Sponsor's Intended Approach 

Six of the early group of pilot projects that preceded the planned 
variation phase of Follow Through elected to remain unsponsored. They 
were the only projects included in this evaluation given this option.
They are classified as "self-sponsored" or "parent-implemented" models
and have instituted programs that they themselves have developed. Since 
a variety of different models exist, it is inappropriate to analyze these 
projects at the sponsor level. Therefore, only project results are pre- 
sented. Even at this level, interpretation is complicated by a lack of 
stated objectives. Where significant results have occurred, there is no 
way to determine whether they are desired results. 

Individual Project Results 

Twelve samples from six different self-sponsored (SS) or
parent-implemented (PI) projects were included in the analysis of interim effects.
Of these, five are self-sponsored, and one is parent-implemented. The
distribution of these evaluation samples in terms of cohort, outcome, 
and project is as follows: 

Cohort    1st-year Effects                  2nd-year Effects

I-K       (projects b, c, d, e, and PI)     (projects b, c, d, e, and PI)
I-EF      (project a)                       (project a)

Project SS(a) 

Located within an SMSA of 1.2 million people in the south Atlantic 
region, Project SS(a) is predominantly white and slightly larger than
average. The anticipated FT per-pupil expenditure of $732 is slightly
below average, and the pupil/PAC ratio was 27.8 to one. As is the case
with all other self-sponsored projects, classroom observation data were
not collected. 

Table 84 presents the two-year impact data and results for Project
SS(a), Cohort I-EF. These results are based on five FT and sixteen NFT
classrooms. Although FT and NFT pupils had comparable scores on baseline
tests, the large difference in the number of classrooms of each type creates
a poor base for comparison. More significantly, the NFT classes were
predominantly Black, while FT classes were predominantly non-Black. Also,
NFT classes had a higher proportion of boys, and more FT pupils had had
preschool experience. 

FT and NFT families were more comparable on some socioeconomic
variables. Both FT and NFT parents showed low educational attainment,
employment in unskilled occupations, high percentages of poverty eligibility,
and high percentages of male heads of household. Nevertheless, more NFT
parents were employed in skilled occupations than were FT parents, and
more NFT heads of household were employed. Both groups responded
favorably to the child's academic progress in nearly equal proportions. As
noted in the child sample discrepancies, preschool and ethnicity variables
differed greatly. FT parents were predominantly white and reported more
preschool for their children. Conversely, NFT parents were predominantly
Black and reported less preschool. Investigations revealed that the FT
families were primarily Spanish speaking (likely Cubans or Puerto Ricans),
indicating an additional cultural bias between the two subgroups.
However, these differences do not appear to have affected the language
performance of the children, since the two groups are nearly equal on
language prescore averages.

FT and NFT teachers (four FT, two NFT) were nearly equivalent on 
variables such as job satisfaction, resources (books and helpers), and 
experience. However, more FT teachers were Black, fewer lived within 
the school community, and fewer were allowed to choose their assignments 
than NFT teachers. 

Analysis of covariance on child level outcomes showed a significant 
difference only for the quantitative measure. The confidence interval 
indicates a 95 percent probability that the true difference is somewhere 
between 1.2 and 13.2 units in favor of FT pupils. Although other measures
tend to favor FT, none of the differences are significant.
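
The way these confidence intervals are read throughout the project results can be sketched in a few lines. The sketch below assumes that each interval is for the adjusted FT-minus-NFT difference, that positive values favor FT, and that a difference is called significant when its 95 percent interval excludes zero. The function name `interval_direction` is hypothetical, and the second interval in the usage lines is made up for illustration.

```python
def interval_direction(lower: float, upper: float) -> str:
    """Classify an FT-minus-NFT confidence interval by its position relative to zero.

    Assumes positive differences favor FT (this does not hold for a measure
    that is scored in the opposite direction, such as absences).
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if lower > 0:
        return "FT"               # whole interval above zero
    if upper < 0:
        return "NFT"              # whole interval below zero
    return "not significant"      # interval includes zero


# Interval reported above for the quantitative measure in Project SS(a).
print(interval_direction(1.2, 13.2))
# Hypothetical interval that spans zero.
print(interval_direction(-3.0, 5.0))
```

Under these assumptions the first call reports an FT-favoring difference and the second reports no significant difference.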

Parent variables indicate a trend that favors the FT group.
Differences on the parent/child and parent/school interaction variables reach
significance (95 percent confidence intervals = .14 to 1.31 and .23 to 1.43,
respectively). In light of the high number of pupils (27.8) per PAC
member, such results are interesting. 

Neither teacher result reached significance on this project. Since
no classroom observation data are available, little can be stated beyond 
the previously noted control variable pattern at the teacher level. FT 
teachers scored higher than NFT teachers on both measures. The difference 
approaches, but does not reach, significance on the parent image variable. 

Without knowing the specific objectives of the model, we can only
note that two years of implementation show gain for the Follow Through
group in child quantitative ability and parent interactions with the
child and the school. Such results show that the FT program had some
impact. Additional information is needed if this impact is to be further
assessed. 

The results of the analysis of first-year effects on children in 
Cohort I-EF are summarized in Table 85. Comparison of the first- and
second-year results shows that the FT pupils did make greater gains in
Spring 1971 than in Spring 1970. All differences favored FT in Spring
1971, while most differences favored NFT in Spring 1970 (although no
Spring 1970 difference reached significance). 



Project SS(b) 

Project SS(b) is of moderate size and is located within a large urban 
area in the east north central region. The anticipated FT per-pupil
expenditure of $1,183 is well above average, and the project had a lower 
than average number of pupils (11.4) per PAC member. 

Table 86 summarizes the two-year impact data and results for Project
SS(b), Cohort I-K. These results are based on four FT and nine NFT
classrooms. Aside from a slightly higher proportion of males in the FT child
sample and a much higher proportion of FT preschool experience (100 percent),
the samples represent a good FT/NFT match. Baseline test measures are
nearly equivalent, and children in both groups were predominantly Black.

Values on parent and teacher control variables also suggest a
reasonably good FT/NFT match for this project. Parent samples are fairly
comparable on percentage with high school diplomas (low), percentage with
skilled occupations (low), percentage who were poverty eligible (high),
and percentage with head of household employed (low). The parents were
predominantly Black, and most households did not have male heads. Few
differences between the two groups of teachers are evident. The FT teachers
did have more book resources and were less integrated into their pupils'
communities than NFT teachers.

Outcome analyses for pupil measures reveal significance only for
attendance, which is FT-favoring. Although FT classes tended to score
higher than NFT classes on basic skill measures, none of these differences
reached significance. Moreover, none of the variables on either the parent
or the teacher impact analyses reached significance. This lack of clear
evidence of impact suggests that this project was not very effective, in
spite of its high expenditures and large PAC.

Table 87 presents the results of the one-year effects analysis on
children in Cohort I. At the end of one year of FT experience,
significant FT-favoring differences were found on measures of achievement, affect,
quantitative, and language. The 95 percent confidence interval for
achievement ranges from .8 to 22.4 units; for affect, from .5 to 5.4; for
quantitative, from .2 to 6.4; and, for language, from .4 to 3.6 units. Just
why this sample failed to maintain its growth rate is far from clear.
Perhaps changes took place within the project, or perhaps the methods
employed produce only short-term gains. Without additional data, only
speculative explanations can be offered. 

Project SS(c) 

Project SS(c) is a very large FT program located in a large city 
(population 4 million) in the middle Atlantic region. The anticipated FT 
per-pupil expenditure of $631 is below average, and there was a slightly
below-average number of pupils (12.2) per PAC member.

Table 88 summarizes the two-year results for Cohort I-K. These
results are based on twelve FT and seven NFT classrooms. Although FT
classes are slightly higher than NFT classes on baseline test averages
and preschool experience, the groups are highly comparable on all other 
variables. The two samples have approximately the same distribution of 
boys and girls. Children in the groups were about the same age and were 
predominantly Black.

Both parent groups were predominantly Black and few parents in either 
group had high school diplomas. Most parents in both groups listed un- 
skilled occupations, and a high proportion of both were poverty eligible. 
A slightly higher percentage of Follow Through parents had high school
diplomas, and a smaller percentage were poverty eligible. FT parents
were more likely to be employed than NFT parents, but NFT parents were
more likely to have a skilled occupation. In addition, a higher
percentage of FT households had male heads. Although those differences that do
exist favor FT, the groups were socioeconomically similar enough to
indicate comparability. Teacher data for this project were insufficient
for analysis. 

Analysis of child outcomes showed significant FT/NFT differences
on three measures (achievement, the WRAT, and reading), all in favor of
FT. The 95 percent confidence interval for achievement ranges from 1.6
to 27.0 units; for the WRAT, from 2.7 to 16.5; and, for reading, from
2.5 to 15.1 units. The overall trend was FT-favoring on all variables. 

Program impact on parents failed to reach significance. In fact, 
FT parents scored lower than NFT parents on all measures except parent 
expectation. This result is consistent with the FT parents' positive
attitude about their children's academic progress shown on baseline scores. 

Analysis of the one-year Cohort I-K effects on children (Table 89) 
shows a progressive gain for the FT groups between Spring 1970 and Spring 
1971. At the end of the first year of experience, the FT classes were
below the NFT classes on all but the cognitive process measure. In fact,
the affect measure showed a significant difference in favor of the NFT
group. Another year's experience produced not only FT-favoring results,
but also significant differences favoring FT on achievement, the WRAT, 
and reading. 

Since we do not know the specific instructional components or pro- 
cedures associated with the project, we can only speculate about the 
reasons for its apparent success. The FT per-pupil expenditure on this
project was below average. Perhaps the high degree of PAC participation
was an important factor. 

Project SS(d) 

Project SS(d) is located in a city of moderate size (825,000) in the
Pacific region. This relatively large project anticipated a below average
FT per-pupil expenditure of $513 and a large number (56) of pupils per 
PAC member. 

Table 90 presents the two-year results for Cohort I-K in Project SS(d). 
These results are based on twelve FT and seven NFT classrooms. Although 
pupils in both FT and NFT classes were predominantly Black and about the 
same age, the FT classes show a slightly higher proportion of boys and 
preschool experience more than four times greater than that of the NFT 
group. More importantly, the FT group scored higher on all baseline 
measures, particularly the reading and quantitative factors. This
discrepancy indicates that the incoming abilities of the two groups were
not comparable. 

These groups were somewhat more comparable on parent factors. Both
were primarily Black with unskilled occupations, fairly low educational
attainment, moderately high poverty eligibility, and high unemployment.
The NFT sample contained a higher percentage of Black parents. The groups
also differed on presence of a male as head of household; there was a
higher percentage in the NFT families.

FT and NFT teachers were nearly equal on rated job satisfaction, 
amount of resources (books and helpers), residence outside the school
community, freedom to choose assignment, and number of years of combined 
training and teaching experience. 

Outcome analyses for pupil measures reveal significant differences 
on the WRAT and reading scores in favor of NFT. The 95 percent confidence
interval for the WRAT ranges from -14.8 to -.2 units, and for reading,
from -13.4 to -.02 units. In light of the baseline bias in favor of FT,
such results are more than indicative of lack of program impact at the 
end of the two-year experience. The low FT per-pupil expenditure could 
be associated with this result. 

Analyses of parent and teacher data indicate FT-favoring trends. 
However, the only measure showing significant difference was acceptance
of method, which showed that the FT teachers were more approving of their
methods than were NFT teachers. This result is somewhat confusing, since
evidence that these methods had impact on FT pupils is lacking.

Analysis of the one-year, Cohort I-K child data (Table 91) shows 
evidence of a progressive deficit for the FT group between Spring 1970
and Spring 1971. In the first year, FT pupils scored lower than NFT
pupils only on the affect and language measures. Since process data were
not collected on this sample, we are unable to offer reasons for this
reversal of outcomes over the two-year period. 

Project SS(e) 

Project SS(e) is a large project in a large West Coast city. The
projected per-pupil expenditure of $698 and PAC involvement (1 per 27 
pupils) are considered below average. 

Table 92 summarizes the two-year results for the eight FT and the
six NFT classrooms included in the project sample. Table 93 summarizes
first-year results. Except that more than four times as many FT pupils
had preschool experience as NFT pupils, the groups are moderately
comparable. Baseline test scores were nearly equivalent, and classroom
compositions were similar.

The samples are somewhat less comparable in terms of family variables.
FT parents were more likely to have a high school education, list unskilled
occupations, be unemployed, and be poverty eligible than were NFT parents.
While both FT and NFT households were characterized by male heads, the
proportion was much higher for NFT households.

First-year results reveal no significant differences in child outcome
measures. The only significant difference in the second-year outcomes
for the two groups of children was in attendance. Unlike the primarily
FT-favoring differences in the first-year results, this attendance
difference and differences in all other second-year child outcome measures
favored the NFT sample. Teacher data were insufficient to support analyses
for this project. No parent outcomes reached significance.

This project failed to demonstrate positive FT impact. Since process
data are unavailable, any interpretation would be merely speculative. 



Project PI

This project is the only parent-implemented project included in this
interim report. It is a small project located in a large urban setting
(population 4 million) in the middle Atlantic region. The anticipated FT 
per-pupil expenditure is below average and the pupil/PAC ratio of 15 is 
near average. 

Two-year data for the Cohort I-K group in this project are summarized 
in Table 94. These four FT and three NFT classrooms appear comparable 
on the basis of pupil measures and classroom composition.

Although neither FT nor NFT parents were highly educated and both
groups were underemployed, the NFT parents appear to be at a much lower
socioeconomic level than the FT parents. Almost twice as many NFT parents
were poverty eligible. In addition, NFT parents were much more likely to
be unemployed, and a much higher percentage of NFT households lacked
male heads.

On the other hand, teacher data show FT/NFT similarities in terms 
of job satisfaction, ethnicity, residence outside the school community, 
freedom to choose assignment, and training and experience. However,
while FT teachers tended to have more helpers in their classrooms, NFT
teachers tended to have more book resources available.

Analysis of pupil outcomes fails to indicate significant FT/NFT
differences, but all differences favor the FT classes. Analyses of parent
and teacher variables do not add to the negligible evidence that this
project had impact. FT parents scored higher than NFT parents on all
variables except sense of control, but none of the differences is
statistically significant. Furthermore, NFT teachers scored significantly higher
than FT teachers in their ratings of how essential they considered contact 
with the parents outside the classroom. 

Overall, there is little evidence of FT impact in this project. Pupil
and teacher outcomes show negligible or unfavorable differences. Since
this project is parent-implemented, the absence of clear parent impacts 
suggests that the project is meeting with little success. 

Analysis of the one-year effects on children in Cohort I-K (see 
Table 95) indicates that FT pupils did improve somewhat from Spring 1970
to Spring 1971. At the end of the first year, the FT children were higher
than the NFT children only on the language variable and equal only on the 
cognitive process variable. At the end of two years, the FT group scored 
higher than the NFT group on all variables. However, additional data are 
needed to determine whether significant program impacts are emerging. 

Part 2

SUMMARY OF OVERALL INTERIM FOLLOW THROUGH EFFECTS 




The interim FT/NFT effects obtained from the separate cohort analyses 
are summarized in Tables 96 through 99. These tables present adjusted
FT/NFT differences on all pupil, parent, and teacher outcomes. They also
include entries for parent satisfaction, teacher job satisfaction, and
number of classroom helpers. These last three variables are included in
the tables because they reflect valid program objectives over and above
differences possibly associated with comparison group problems. Note,
however, that these "effects" are unadjusted, and that they are used as
input controls (covariates) in the analysis of the other program outcomes.

Table entries with a positive sign indicate differences favoring FT;
those with negative signs show differences favoring NFT. Those differences
reaching significance (p < .05) are flagged with an asterisk. Overall cohort
averages (computed by summing across projects) are presented at the bottom
of each table. These overall, or "average," cohort values represent the
mean FT/NFT difference for each outcome. The row marked "Percent FT
Favoring" at the bottom of each table shows the percentage of projects
reporting FT-favoring differences for each outcome variable.
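
The two bottom rows of each summary table can be illustrated with a short sketch. It assumes each table column is a list of adjusted FT-minus-NFT differences, one per project, and it applies the report's stated exception that for attendance a minus sign (fewer absences) favors FT. The function names, the outcome labels, and the example numbers are hypothetical, and treating a difference of exactly zero as not FT-favoring is an assumption of this sketch.

```python
from typing import Sequence, Tuple


def favors_ft(outcome: str, diff: float) -> bool:
    """Return True when a project-level FT-minus-NFT difference favors FT.

    Attendance is the exception to the sign rule: fewer absences for FT
    appear as a negative difference.
    """
    if outcome == "attendance":
        return diff < 0
    return diff > 0


def summarize_outcome(outcome: str, diffs: Sequence[float]) -> Tuple[float, float]:
    """Return (mean difference, percent of projects favoring FT) for one outcome."""
    if not diffs:
        raise ValueError("at least one project difference is required")
    mean_diff = sum(diffs) / len(diffs)
    pct_favoring = 100.0 * sum(favors_ft(outcome, d) for d in diffs) / len(diffs)
    return mean_diff, pct_favoring


# Hypothetical differences for four projects on a reading measure.
print(summarize_outcome("reading", [2.0, -1.0, 3.0, 4.0]))
# Hypothetical attendance differences: only the negative one favors FT.
print(summarize_outcome("attendance", [-1.0, 0.5]))
```

A column can therefore show a positive mean difference while well under 100 percent of projects favor FT, which is why the tables report both rows.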

Because the interpretation of outcome effects is moderated by the 
comparability of FT and NFT samples, the tables include a designation for 
each FT/NFT comparison as a "good," "moderate," or "poor" match. These
designations were derived by inspecting seven of the demographic baseline
variables:



(1) Percentage of students with preschool experience.
(2) Percentage of parents without high school diplomas.
(3) Percentage of parents in skilled occupations.
(4) Percentage of Black parents.
(5) Percentage of parents who are poverty eligible.
(6) Percentage of heads of household currently employed.
(7) Percentage of heads of household who are male.



* The exception to this rule is the attendance variable; fewer absences
for FT are represented by a minus sign.

For each project, the number of those variables showing an FT/NFT
difference of 10 percentage points or more was tabulated. Three or fewer
discrepancies of 10 percent or more resulted in the classification of an FT/NFT
comparison as a "good" match. Four or five discrepancies of 10 percent
or more resulted in a "moderate" match classification, and six or seven
discrepancies of 10 percent or more resulted in a classification of "poor."
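
The classification rule just described is simple enough to state as code. The following is a minimal sketch, assuming each group's baseline values are available as percentages keyed by variable; the key names and the function `classify_match` are hypothetical labels for the seven variables listed above, not names used in the evaluation itself.

```python
from typing import Mapping

# Hypothetical keys for the seven demographic baseline variables.
BASELINE_VARIABLES = (
    "pct_preschool",
    "pct_parents_no_hs_diploma",
    "pct_parents_skilled",
    "pct_black_parents",
    "pct_poverty_eligible",
    "pct_head_employed",
    "pct_head_male",
)


def classify_match(
    ft: Mapping[str, float],
    nft: Mapping[str, float],
    threshold: float = 10.0,
) -> str:
    """Label an FT/NFT comparison as a good, moderate, or poor match.

    Counts the variables on which the two groups differ by `threshold`
    percentage points or more, then applies the cutoffs from the text:
    0-3 discrepancies is "good", 4-5 is "moderate", 6-7 is "poor".
    """
    discrepancies = sum(
        1 for name in BASELINE_VARIABLES
        if abs(ft[name] - nft[name]) >= threshold
    )
    if discrepancies <= 3:
        return "good"
    if discrepancies <= 5:
        return "moderate"
    return "poor"


# Made-up example: the groups differ by 10 points or more on four variables.
ft_group = dict.fromkeys(BASELINE_VARIABLES, 50.0)
nft_group = dict(ft_group, pct_preschool=20.0, pct_black_parents=65.0,
                 pct_head_employed=38.0, pct_head_male=60.0)
print(classify_match(ft_group, nft_group))
```

With four discrepancies the example comparison is labeled a moderate match under the stated cutoffs.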

Since these labels are somewhat arbitrary, they should not be taken
literally. One could reasonably argue that a good match is one in which
FT and NFT differ on none of the demographic variables by more than 10
percent, but such a condition is virtually nonexistent in the present
comparison. However, our classification scheme does provide useful information,
and it is discussed later in this section. 



Discussion of Summary Tables 

Cohort I, Kindergarten: Fall 1969 to Spring 1971

The second-year outcomes for Cohort I-K are summarized in Table 96. 
These results are mixed, with little consistent evidence of FT impact on 
the child outcome variables. Inspection of the average FT/NFT difference 
summed across projects shows FT-favoring differences on the achievement 
measure, the attendance measure (negative signs indicate less absenteeism),
and the quantitative and cognitive processes measures. The remaining
measures favor NFT. All of these differences are small and not
especially noteworthy.

The parent outcome measures are also not especially noteworthy, except
for the parent/school involvement variable, for which 81 percent of
the projects display FT-favoring results. This suggests that FT is having
an impact on the degree to which parents become involved in school-related
activities. (One must keep in mind that these measures are obtained
during the child's first year of FT participation. Thus, parent outcomes
are first-year outcomes, regardless of the cohort or grade stream.)

For Cohort I teacher results, three of the four measures display
strong results.* Seventy-five percent of the projects display FT-favoring
differences for both the teacher acceptance of methods and the number of
classroom helpers available to the teacher. On the other hand, only 10
percent of the projects display FT-favoring differences for the teacher's
image of the parent as an educator (i.e., in 90 percent of the projects,
NFT teachers considered at least some kinds of involvement with parents
in the educational process more "essential" than did the FT teachers).
This outcome is difficult to interpret without more information. At
first, it suggests that for this cohort, a high degree of parental
involvement has not been favorably received by many FT teachers. It may
also mean that FT teachers, having had extensive contact with parents, view
the parental role as supportive rather than essential.

* The teacher satisfaction variable and the number of classroom helpers
variable are represented by unadjusted (i.e., not subject to covariable
adjustment) outcome measures.

Cohort I, Kindergarten: Fall 1969 to Spring 1970

The one-year effects for Cohort I-K are also summarized in Table 96.
The evidence of program impact is slightly more encouraging in these data.
FT-favoring outcomes are noted in 75 percent of the projects for the
quantitative measure and in 83 percent of the projects for the cognitive
processes variable. However, the average FT-favoring difference per
project for these variables is relatively small. The results for the
remaining variables are not particularly noteworthy.

Table 96 also permits comparison of one-year and two-year summary
effects of Cohort I-K. The data row labeled "Two-Year Outcomes for
Projects with One-Year Data" presents the average FT/NFT second-year
difference scores for the projects included in the first-year sample
(i.e., these projects represent a subset of the Cohort I-K two-year
effects sample). These comparisons suggest some longitudinal impact for
FT projects, since the percentage of FT-favoring outcomes is generally
higher in the second year than it is in the first. This trend is also
apparent in the average FT effect across projects. This indicates that,
with the exception of the cognitive processes variable and the language
variable, FT children show a greater advantage over their NFT counterparts
after two years than they do after one year.

Cohort II, Kindergarten: Fall 1970 to Spring 1971 

The summary data for Cohort II-K are presented in Table 97. An
examination of the child outcome variables is moderately encouraging in
that all of the variables except affect show average differences in favor
of FT. This pattern is also reflected in the percentage of projects
reporting FT-favoring outcomes; the percentages range from a low of 50 (for
the affect variable) to a high of 75 (for the achievement, quantitative,
and reading variables).
Encouraging effects are also present for the parent outcome variables,
where four of the five measures show overall FT-favoring differences (the
parent expectation variable is the exception). For all variables, the
majority of the projects showed FT-favoring results.

Teacher outcome data for Cohort II-K were available for only four 
projects. Although the small number of projects precludes interpreting 
the data with confidence, the summary outcomes are generally FT favoring. 

Cohort I, Entering First: Fall 1969 to Spring 1971 

The second-year summary for Cohort I-EF is presented in Table 98.
The results for the child outcome variables in this table are considerably
less encouraging than those for Cohort I-K projects. In particular, the
affect, WRAT, reading, and language measures display average outcomes
favoring the NFT sample. The only overall outcomes in favor of FT were
those for the attendance and quantitative variables. This pattern is
also displayed by the percentage of FT-favoring results. Only the
attendance and reading variables show a higher number of FT-favoring
differences.

The parent outcome results for this cohort sample show a provocative
and somewhat paradoxical pattern. Overall results showed FT-favoring
differences (both average project effects and percentage of FT-favoring
differences) for the parent-child interactions, the parent-school
involvement, and the sense of control variables. The average project
effect was NFT favoring for parent expectations; in only 20 percent of the
projects did FT parents report higher expectations for their child's
success than did NFT parents. But, ironically, 80 percent of these same
projects showed higher proportions of FT parents reporting they were
satisfied with their child's current progress. This might indicate that FT
parents in these predominantly Southern rural projects simply have lower
overall aspirations for their children and, hence, appear satisfied with
their children's current progress. But this interpretation is hard to
reconcile with the more positive FT results for the school involvement and
sense of control measures, unless these parents are responding to the
expectation measures on the basis of a larger socio-cultural context.
(Many of these projects involved very poor Black FT families and less poor
non-Black NFT families.)

The teacher results for this cohort sample are also difficult to
interpret straightforwardly. The average project differences show
virtually no FT effects on the teacher's image of the parents, and FT
teachers appear, on the average, no more or less satisfied with their
working conditions than do NFT teachers. But literally every project shows
FT teachers are more approving of their methods than NFT teachers, and in
only one project did NFT teachers report more classroom helpers than did
FT teachers. This pattern of outcomes suggests FT is well regarded by
these teachers, perhaps because they receive more classroom assistance.
The interesting result is that these FT teachers are apparently less
negative (compared with NFT teachers) in their view of the parent as an
educator than were the FT teachers in the Cohort I-K and II-K samples.
Whether this reflects better parent-teacher relationships, indifference,
or something else is unclear at this point in the evaluation.

Cohort I, Entering First: Fall 1969 to Spring 1970

The available first-year child outcome data for Cohort I-EF, also
presented in Table 98, can be directly compared with second-year data.
The number of projects for which both one- and two-year results exist is
small (seven in all), which necessarily limits the confidence that can
be placed on the interpretation. The pattern of first-year results is
essentially the same as that noted for the second-year data. Most of the
variables display NFT-favoring trends (the exceptions are the attendance
and affective measures) for both the average difference measure and the
percentage of FT-favoring projects measure.

The comparison of these Cohort I-EF first-year and second-year effects
reveals some interesting differences from those displayed by the Cohort
I-K sample. Specifically, second-year effects for Cohort I-K were more
favorable than first-year effects, suggesting a cumulative positive impact
for FT. The opposite is true for the present Cohort I-EF projects, where
there is evidence of a progressive decrement; second-year outcomes more
frequently reflect NFT-favoring trends than do the results for these same
children after their first year in the program. But since the NFT samples
represent a different population than the FT samples in many instances,
this result is likely due to inappropriate comparisons and thus is not
interpretable.

Cohort II, Entering First: Fall 1970 to Spring 1971

The results for the Cohort II-EF projects are summarized in Table
99. Since only four projects are included in this sample, summary
statistics are likely to be unreliable. The results presented indicate a
favorable impact for the FT program. With the exception of the attendance
variable, all of the child outcome variables show FT-favoring differences.
A similar pattern is observed for the parent outcome variables,
where FT-favoring trends are present for all variables. The teacher
outcome variables also display favorable trends, with the exception of the
parent image measure.
The Parent Image Variable



The overall trend toward negative FT results for the teacher's image
of parents as educators deserves some special attention. NFT teachers
who have not had much interaction with parents may have an exaggerated
notion of the importance of such contacts, and FT teachers may be making
more informed judgments. Perhaps contact with parents outside of the
school context is less important to FT teachers because they have more
in-school interaction and because they feel that further contact with
parents is unnecessary. Or perhaps FT parent-teacher interactions have,
in fact, engendered resentments. The difficulty is that we do not know
at present why the teachers responded the way they did to the items as
presented on the questionnaire, and thus, we prefer not to draw
interpretative conclusions at this point in the study. Nevertheless, the
pattern is clear and consistent; NFT teachers reliably tended to rate
parents and parent contacts outside of class as more essential to the
child's education than did FT teachers.

Discussion of Sample Matching

The classification scheme used to index sample comparability was
devised to assist interpretation of the outcomes associated with the FT
projects. As described earlier, this scheme provides a basis for
classifying each FT/NFT comparison within projects as a "good," "moderate,"
or "poor" match. The frequencies with which projects were classified
into "good," "moderate," and "poor" match categories relative to each
set of outcome analyses are summarized in Table 100. This table shows
that of 28 Cohort I-K projects for which two-year child outcomes were
analyzed, 12 were classified as having reasonably "good" baseline
comparability, 12 as "moderate," and 4 as "poor." Similarly, for the 26
Cohort I-K projects for which parent outcomes were analyzed, 11 had "good"
matches, 12 had "moderate," and 3 had "poor."

Using this classification scheme and summing across all such entries,
roughly 41 percent of the outcome analyses involved "good" FT/NFT matches,
48 percent involved "moderate" matches, and 11 percent involved "poor"
matches. But two important features of this procedure need to be stressed.
First, the classification scheme is arbitrary, although we believe it is
reasonable and objective. Second, classroom composition and pupil
families are the match variables, whereas the match classification is
applied to pupil, parent, and teacher outcome analyses. Hence, it would not
be unusual if teacher outcomes were unrelated to the quality of
pupil/parent matches.
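
The overall percentages quoted above can be re-derived from the Table 100 frequencies. The sketch below simply sums the 14 rows of that table by match category; the row values are transcribed from Table 100, while the variable names and the data layout are illustrative assumptions.

```python
# Rows of Table 100: (cohort sample, outcome measures, good, moderate, poor).
TABLE_100_ROWS = [
    ("I-K, two year", "child", 12, 12, 4),
    ("I-K, two year", "parent", 11, 12, 3),
    ("I-K, two year", "teacher", 10, 9, 1),
    ("I-K, one year", "child", 6, 4, 2),
    ("II-K, one year", "child", 5, 3, 0),
    ("II-K, one year", "parent", 4, 3, 0),
    ("II-K, one year", "teacher", 2, 2, 0),
    ("I-EF, two year", "child", 2, 7, 2),
    ("I-EF, two year", "parent", 2, 7, 1),
    ("I-EF, two year", "teacher", 1, 6, 1),
    ("I-EF, one year", "child", 1, 5, 1),
    ("II-EF, one year", "child", 2, 1, 1),
    ("II-EF, one year", "parent", 2, 1, 0),
    ("II-EF, one year", "teacher", 2, 1, 1),
]

# Sum each match category across all outcome analyses.
totals = {
    "good": sum(row[2] for row in TABLE_100_ROWS),
    "moderate": sum(row[3] for row in TABLE_100_ROWS),
    "poor": sum(row[4] for row in TABLE_100_ROWS),
}
grand_total = sum(totals.values())

# Express each category as a percentage of all outcome analyses.
percentages = {
    category: round(100.0 * count / grand_total, 1)
    for category, count in totals.items()
}
print(totals, grand_total, percentages)
```

Note that the unit being counted is an outcome analysis, not a project: the same project appears once for each of the child, parent, and teacher analyses in which it was included.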
TABLE 100

FREQUENCY OF "GOOD," "MODERATE," AND "POOR" FT/NFT
MATCHES, BASED ON BASELINE DIFFERENCES
FOR THE SEVEN MATCH VARIABLES

                   OUTCOME            MATCH CATEGORY
COHORT SAMPLE      MEASURES     GOOD    MODERATE    POOR    TOTAL

I-K, TWO YEAR      CHILD         12        12         4       28
                   PARENT        11        12         3       26
                   TEACHER       10         9         1       20

I-K, ONE YEAR      CHILD          6         4         2       12

II-K, ONE YEAR     CHILD          5         3         0        8
                   PARENT         4         3         0        7
                   TEACHER        2         2         0        4

I-EF, TWO YEAR     CHILD          2         7         2       11
                   PARENT         2         7         1       10
                   TEACHER        1         6         1        8

I-EF, ONE YEAR     CHILD          1         5         1        7

II-EF, ONE YEAR    CHILD          2         1         1        4
                   PARENT         2         1         0        3
                   TEACHER        2         1         1        4

TOTAL                            62        73        17      152

PERCENT                        40.8%     48.0%     11.2%     100


With these cautions in mind, several interesting trends can be
observed in Table 101, which displays the distribution of FT-favoring
results for the independent outcome variables as a function of match
category. These frequencies are tabulated within each of the four cohort
groupings for each of the six separate (nonoverlapping) adjusted pupil
outcome measures, for the four adjusted and one unadjusted parent variables,
and for the two adjusted and two unadjusted teacher variables. Hence, for
Cohort I-K project child outcomes, 5 of the 12 "good" matched projects,
7 of the 12 "moderate" matched projects, and 3 of the 4 "poor" matched
projects showed FT-favoring results on the affect measure. Similarly, 8
of the 12 "good" matched projects yielded FT-favoring differences for
attendance, and so forth.
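
To make the reading of such entries concrete, the affect-measure frequencies just quoted for Cohort I-K can be turned into rates per match category. This is a small illustrative sketch; only the three count pairs come from the text, and the names are hypothetical.

```python
# (FT-favoring projects, total projects) on the affect measure, Cohort I-K,
# as quoted in the text for each match category.
AFFECT_COUNTS = {
    "good": (5, 12),
    "moderate": (7, 12),
    "poor": (3, 4),
}

# Percentage of projects in each match category with FT-favoring results.
affect_rates = {
    category: 100.0 * favoring / total
    for category, (favoring, total) in AFFECT_COUNTS.items()
}

for category, rate in affect_rates.items():
    print(f"{category}: {rate:.1f}% FT-favoring")
```

With only 4 projects in the "poor" category, a single project shifts that rate by 25 percentage points, so the per-measure rates are best read alongside the pooled figures rather than on their own.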
The entries at the bottom of each measurement category in Table 101
show the overall frequency and percent of FT-favoring results for each
separate match classification. Thus, for Cohort I-K, 61 percent of the
independent child outcomes among "good" matched projects favored FT, 56
percent of the "moderate" matches favored FT, and 46 percent of the "poor"
matches favored FT. This trend is even more pronounced for Cohort I-EF,
where the values are 80, 49, and 20 percent, respectively. The trend is
less clear for Cohort II-K and II-EF, although interpretation is difficult
because there were no "poor" match classifications for Cohort II-K and
because only four projects were analyzed for Cohort II-EF. Yet the Cohort I
pattern does appear to suggest that outcomes favor FT more strongly when
comparability is good. Recall that FT populations tend to be more
disadvantaged than [NFT populations]; when the FT/NFT comparisons
are made between similar populations, FT-favoring pupil outcomes are likely.

The relationship of outcomes to quality of match becomes even more
pronounced with parent data. This result is important because six of the
seven match variables are parent characteristics. Although this evidence
of a strong relationship between the proportion of FT-favoring results
and the comparability of FT/NFT families is probably due to the
inappropriateness of the covariance model (or any analysis model) when bias
becomes great, it is not immediately clear why NFT is favored. A further
breakdown of the nature of these extreme biases in the match between FT and
NFT suggests an answer. In some three-fourths (76 percent) of the
instances of extreme noncomparability (i.e., poor matches between FT and
NFT), NFT was markedly less disadvantaged than FT. In contrast, barely
more than half (56 percent) of the moderate mismatches between FT and NFT
showed FT to be more disadvantaged than NFT. Hence, it is clear that these
results are seriously dependent on both the magnitude and direction of
initial biases and that covariance adjustments are insufficient to
overcome these biases with these interim child and parent data.

We feel this result is very important, since it typifies the
extraordinarily complex problem of obtaining valid assessments of individual
project results. Figure 5 graphically displays the relationship of the
pupil outcome trends to the degree of initial bias in the FT and comparison
samples. These patterns are displayed separately for the K entrance (Part
A) and First Grade entrance (Part B) project samples. Several features of
the interim pupil results become evident in this display. First, the
relationship between outcomes and match bias is unmistakable, even though
the match variables were included as covariates in the analyses of the
data. Second, this relationship appears more pronounced for EF than for
K samples, which corresponds to our previously noted observation that
match problems tend to be more severe for EF than K projects. Third,
these match bias problems notwithstanding, the frequency of results
reaching significance is clearly in favor of FT, whereas the rate of
significant differences for the NFT groups is generally at or below the
"experiment-wise" error rate (i.e., the overall alpha level). Finally, in
our judgment, these trends provide sufficient evidence for deleting project
samples judged as poorly matched with their comparisons from our summary
assessments of interim program impacts on pupil and parent measures, since
in these cases the initial mismatch apparently dominates the data.

Restricting our tabulations of outcomes (Table 101) to those project
samples judged either well or moderately matched on the seven match
variables (classroom composition and family characteristics), several
interesting trends emerge. Figure 6 shows that for the 33 Cohort I (good
and moderate matched) projects about 58% of the adjusted mean differences on
pupil variables showed FT as having a positive impact, whereas for the
11 Cohort II projects, 67% of the outcomes were FT favoring. This upward
trend in the proportion of FT-favoring results (especially the proportion
of differences reaching significance) from Cohort I to Cohort II
demonstrates an apparent program maturation phenomenon. That is, the impact
of the program on successive cohorts appears to be getting stronger.
This could mean that as sponsors, teachers, and other stakeholders gain
experience with FT-like procedures, they become more effective
implementers of the program. Also involved in this apparent effect, however,
are improvements in evaluation methodology. Cohort II baseline measures
were gathered substantially closer to the commencement of school and data
collection procedures were more systematic, complete, and error-free than
was the case for Cohort I.*

However, the interpretation of an FT maturation or implementation
effect is further supported upon comparison of program impact on successive
cohort parent and teacher samples. Cohort I to Cohort II outcome (FT-NFT
difference) trends on the five parent measures (reported child
interactions, school involvement, satisfaction with progress, future
expectations, and sense of control) are summarized in Figure 7. These
results again sum across only those projects for which NFT
matches were categorized as good or moderate. Thus, of the 32 Cohort I
projects, nearly 61% of the results showed positive FT impact, whereas
for the 10 Cohort II projects, 72% of the results were FT favoring. This
trend is further supported by a parallel trend in the proportion of
differences reaching significance.



Teacher outcome data (Figure 8) show even stronger implementation or
program maturational trends. These results, which are summed across all
projects,† show the proportion of FT- or NFT-favoring (and proportion
significant) differences on the four teacher variables: parent educator
image, professional acceptance of method, job satisfaction, and adult
assistance. For the 28 Cohort I teacher samples, approximately 60% of
all adjusted mean differences on these measures were FT favoring, whereas
for the 8 projects represented by the Cohort II teacher data, over 78%
of all differences favored FT. Thus it is difficult to escape the
interpretation that FT is having a stronger and more positive impact on
successive samples of parents and teachers, as measured by the current
evaluation variables.

* Recall that Cohort I baseline pupil data were collected as late as
December 1969, whereas Cohort II pretest measures were gathered in
September and October 1970. Since baseline differences were included as
covariates, Cohort I FT programs which produced impacts in the first few
months of school would be (invalidly) penalized, or at least
underestimated.

† Since match categories were constructed independent of teacher
consideration, there is no a priori basis for excluding teacher comparison
data on projects mismatched with respect to pupil/family characteristics.
In fact, the overall FT-favoring outcome rates were 70%, 60%, and 83% for
good, moderate, and poor matches, respectively.

In summary, given the provocative relationships of outcomes to initial
FT/comparison group equivalence as demonstrated above, added to the
multiplicity of design and methodological factors which combine to produce
extremely low statistical power for detection of differences as significant,
it seems preferable to interpret these data patterns in the context of
the overall, as opposed to "significant only" results, noting carefully
each of the major caveats associated with interpretations at this time.
We feel there is evidence of an emerging program impact, which is
increasing in magnitude as the program matures. These trends are revealed by
inspection of pupil, parent, and teacher outcomes for successive cohort
samples, controlling for FT/NFT comparability. The extent of project and
sponsor variation in conjunction with problems of sample size, measurement
validity, and comparison group bias makes these conclusions highly
speculative. However, one could expect little more from only a glimpse
of emerging effects representing a limited sample of observations made
less than one-fourth of the way through a planned national longitudinal
experiment. To try to infer or conclude more than this now would be
equivalent to attempting to describe the results of a race based on the
relative positions of just a few participants at a point less than
one-quarter of the distance to the goal. Such conjecture is neither
responsible nor in the interest of the evaluation.
PART 3: OBSERVATIONAL EVIDENCE OF PLANNED VARIATIONS



Background

A central purpose of the national Follow Through Program was to
encourage the development and refinement of improved approaches to early
education. Each of the sponsored Follow Through approaches is intended
to offer an alternative to the kinds of school experiences traditionally
encountered by poor children during their primary school years.

In the interim evaluation, this concept of "planned variation" rep- 
resents the "treatment" whose impacts are being evaluated. As shown in 
preceding sections, some sponsors emphasize changes in parent-school 
communication patterns, others concentrate on influencing parent-child
interactions, and nearly all attempt to affect teacher-child interactions 
in the classroom. The parent and teacher outcome measures obtained during 
the first year of the children's FT experience can be considered as evi- 
dence of the implementation of treatment. A separate study of community 
factors is designed to assess changes in parent-school relations, which 
can, in turn, be considered as important mediators of child outcomes. 

This interim evaluation does not purport to test a definitive model 
of factors causally related to child achievement and affect. Moreover, 
an assessment of all possible determinants of child outcomes is beyond 
the scope of this evaluation. In fact, many such possible determinants 
may still be latent now, and their detection will be dependent on
subsequent assessments.

To the extent that we can observe reliable components that system- 
atically occur in classrooms within a planned variation, we have some 
evidence as to what comprises these treatments. Moreover, it is through 
these classroom observation data that we have our best evidence that
treatments, as such, even existed. The existence of evidence of imple- 
mentation of the various FT treatments contrasts with a lack of documen- 
tation of treatment in several other compensatory education programs 
(Wargo, 1972). We present here evidence that FT sponsors have, in fact, 
implemented treatments at the classroom level (as administered by 
teachers) that are discernibly different from activities observed in 
comparison schools or in other FT-sponsored schools. 
Evidence of Differential Treatments



Evidence used to assess and to evaluate the occurrence of systematic
classroom components comes from data obtained for the classroom
observation sample of 1970 to 1971. This sample, summarized in Table 102,
consisted of 123 classrooms, distributed across 2 grade streams and 2
cohorts, in 17 different projects.


TABLE 102

DISTRIBUTION OF CLASSROOMS IN THE CLASSROOM
OBSERVATION SAMPLE

                             COHORT I                  COHORT II
                      NON-ENTERING   SECOND                    ENTERING
GRADE STREAM   GROUP   FIRST GRADE    GRADE   KINDERGARTEN  FIRST GRADE  PROJECTS

KINDERGARTEN   FT          33                      33                       12
               NFT          8                      [8]

ENTERING       FT                      14                        17          5
FIRST          NFT                     [5]                        5

TOTAL                      41          19          41            22         17


A few classrooms in which observations occurred were not included
in classroom testing in Spring 1971. Thus, the observation sample is
not fully nested within the set of projects and classrooms for which
analyses of test data were possible. The overlap is substantial, however,
and in all projects in which both testing and classroom observation took
place, process data obtained from the observations are included in the
tables of outcomes.

The evidence to date is not definitive regarding the existence of
approaches that differ systematically from one another and are also
consistent from place to place within an approach. These data do strongly
suggest, however, that there are some reliable differences between sponsors
and that, for many process variables, there is reasonable consistency from
classroom to classroom across projects within a sponsored approach.
To summarize the outcome tables, we have displayed factor scores
derived from the observations. These factor scores provide a qualitative
indication of process characteristics within those projects where
observations were conducted. Informal analyses of factor score profiles
suggest substantial homogeneity across classrooms within sponsor categories
for some factors and considerable heterogeneity across classrooms within
sponsor categories for other factors.

For seven of the nine sponsors included in the sample, observations
occurred in two different projects, thus enabling analysis of interproject
process consistency. Before summarizing this examination, it is important
to emphasize that sponsor and district or project influences are
confounded; i.e., to the extent that process regularities occurring in
observed classrooms are more a function of district influences than sponsor
influences, our interpretations of those regularities as treatments will
be in error.

Despite these interpretive caveats, visual examination of the factor
profiles supports two interpretations:

(1) Reasonable similarity exists between projects within sponsor
categories on some factors.

(2) The two factors for which there is the least apparent
intersite difference within sponsors are also the factors that
reveal the most consistent intersponsor differences.

Table 103 summarizes the results of judgmental interpretation of
factor scores across classrooms within and between projects for each of the
seven sponsor categories in which observation occurred in two different
projects. This table shows that for the self-regulatory and programmed
academic factors, apparent differences between groups of classrooms in two
different projects were small for nearly all sponsors. In contrast, most
of the seven sponsors show higher variability across two groups of
classrooms for the remaining three factors. The sponsors differed somewhat
on their apparent consistency for certain factors. For example, while the
self-regulatory and programmed academic factors appeared to differentiate
reasonably well among the set of seven sponsors, additional factors may
also be characteristic of a particular sponsor. For example, FW projects
showed reasonable similarity between two project locations on the
self-regulatory factor, the programmed academic factor, and the child
self-learning factor; and UA projects showed reasonable similarity on the
self-regulatory factor, the child-initiated interaction factor, and the
programmed academic factor.
TABLE 103

INTERPROJECT DIFFERENCES WITHIN SPONSOR
CATEGORIES FOR FACTOR SCORES

                                   FACTORS

             SELF-      CHILD-INITIATED  PROGRAMMED                 CHILD
SPONSOR    REGULATORY    INTERACTIONS     ACADEMIC   EXPRESSIVE  SELF-LEARNING

FW             -              *              -           *             -
UA             -              -              -           *             *
BC             -              *              -           -             *
UO             -              *              -           *             *
UK             -              *              -           *             *
HS             -              *              *           *             -
UF             *              -              -           -             *

* = Apparently large interproject differences within sponsor categories.
- = Apparently small interproject differences within sponsor categories.

As noted above, the two factors for which sponsors tended to
display the most interproject consistency were the self-regulatory factor
and the programmed academic factor. Figures 9 and 10 show sponsor average
factor scores (i.e., averaged across classrooms) for the self-regulatory
and the programmed academic dimensions, respectively. It is difficult
to avoid an impression of intersponsor difference from these graphs.

Analyses of variance were subsequently performed on each of the
41 discrete variables† that served as input to the factor analyses. These
further analyses were of two forms: (a) comparison of average FT versus
NFT values averaged across all FT/NFT classrooms, and (b) intersponsor
comparisons of means. The first analysis served to point up process
components that differentiate FT from NFT. The latter analyses provide
a basis for qualitative validation of planned variations on a
variable-by-variable basis.

† Discussion in this report is restricted to the 41 variables used in
the factor analysis. Additional details, as well as examination of
nearly 30 additional variables, are presented in the SRI report, "Follow
Through Classroom Observations" (1972b, Appendix B).

Results of the across-sponsor comparisons reveal consistent and
significant FT/NFT differences for all process variables directly related 
to the number of adults involved in classroom activities and for a number 
of other variables that are probably (but not certainly) related to adult- 
child ratios. Only one process variable that may be independent of adult- 
child ratios displayed significant FT/NFT differences. This pattern of 
results is shown in Table 104. 

In Table 104, variables that significantly differentiate between 
FT and NFT classrooms are grouped in terms of apparent relationship to 
the number of adults in the classroom. These variables are listed in 
terms of the direction (FT or NFT) of the difference. 



Since the evidence clearly indicates that FT/NFT treatment differences 
are strongly dependent on adult participation in the classroom, a tabu- 
lation of adult-pupil ratios observed in these classrooms was made. The 
results of this tabulation, displayed below, show how different the two 
groups really are on this variable. 



               Follow Through     Non-Follow Through
Ratio             (N = 97)             (N = 26)

Maximum           1 to 3               1 to 7
Average           1 to 6.8             1 to 12.8
Minimum           1 to 14              1 to 33
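The size of the gap in the tabulation above can be made concrete with a little arithmetic. The following is only an illustrative sketch: the ratio values are taken from the tabulation, while the function name and the "adults per 100 children" scale are assumptions introduced here for readability.

```python
# Adult-to-child ratios from the tabulation, stored as "children per adult".
ratios = {
    "FT": {"maximum": 3.0, "average": 6.8, "minimum": 14.0},
    "NFT": {"maximum": 7.0, "average": 12.8, "minimum": 33.0},
}


def adults_per_100_children(children_per_adult: float) -> float:
    """Convert a '1 to N' adult-child ratio into adults per 100 children."""
    return 100.0 / children_per_adult


ft_avg = adults_per_100_children(ratios["FT"]["average"])
nft_avg = adults_per_100_children(ratios["NFT"]["average"])

# How many times more adults per child the average FT classroom has.
relative = ft_avg / nft_avg

print(f"FT: {ft_avg:.1f} adults per 100 children")
print(f"NFT: {nft_avg:.1f} adults per 100 children")
print(f"FT/NFT: {relative:.2f}")
```

On these figures the average FT classroom has a little under twice as many adults per child as the average NFT classroom.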



We feel that this result is important because a favorable adult-child
ratio is a necessary condition for the implementation of many critical
features (or treatment components) of the planned variations. Evidence
for the implementation of planned variations at a more detailed level than
sheer adult-child ratio is available. Sponsors report preservice and
inservice training for aides as well as for teachers. Observation data
reported more fully elsewhere indicate that not only are there more aides
in FT, but also that they function differently in the academic program
in the classroom than do NFT aides.






Table 104

PROCESS VARIABLES THAT DIFFERENTIATE* BETWEEN FT
AND NFT CLASSROOMS

Relation of Process to Adult-to-Child Ratio in the Classroom,
with Characteristic Processes

Almost certainly directly related to number of adults

   Follow Through:
      Adult with one or two children in academic activities
      Adult with small groups of children in academic activities
      Aides participating in academic activities
      Adult communication focus on small groups of children

   Non-Follow Through:
      Adult communication focus on large groups of children

Probably related to number of adults

   Follow Through:
      Arts, crafts, sewing, cooking, pounding activities
      Blocks, trucks, dolls, dress-up activities
      Guessing games, table games, puzzles
      Independent child activities
      Snacks, lunch, group time
      Wide variety of activities

   Non-Follow Through:
      Adult informing children

May not be related to number of adults

      Academic activities

* Differences between FT classrooms (N = 97) and NFT classrooms (N = 26)
were statistically significant (p < .05).
The results of the intersponsor analyses for process variables are
summarized in Table 105, which presents the rank order of sponsor mean
values for each of the 41 variables analyzed. The aggregate NFT value
is represented by an arrow in each such array. The Newman-Keuls
procedure for multiple comparisons was employed to test for differences
among all possible ordered pairs within each array. The results of these
tests are represented by the underscores. Within an array, any two sponsors
not underscored by a common line differ significantly at p < .05. For
example, in Table 105, Variable 1, the UO mean is significantly different
from the HS, UF, NY, BC, and FW means (not underscored by the same line)
but not significantly different from the UK, ED, or UA means (underscored
by the same line). Similarly, the UA mean differs significantly only from
the BC and FW means on Variable 1. Finally, the ED and UK means are not
significantly different from those of any other sponsor (i.e., they share
at least one common underscore with all other means) on this variable.
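The underscore convention described in the preceding paragraph can be sketched in a few lines of code. This is a hedged illustration only: the sponsor order is the one listed for Variable 1, but the three underscore spans are hypothetical, chosen solely because they reproduce the pairwise statements made in the paragraph; the function names are likewise assumptions.

```python
# Sponsors for Variable 1, ordered from lowest to highest mean.
order = ["UO", "UA", "ED", "UK", "HS", "UF", "NY", "BC", "FW"]

# Hypothetical underscore spans, given as (first index, last index) into
# `order`. They are an assumption that matches the text, not a transcription.
underscores = [(0, 3), (1, 6), (2, 8)]


def share_underscore(a: str, b: str) -> bool:
    """True if sponsors a and b sit under at least one common underscore."""
    i, j = order.index(a), order.index(b)
    return any(lo <= i <= hi and lo <= j <= hi for lo, hi in underscores)


def differs_from(sponsor: str) -> list[str]:
    """Sponsors whose means differ significantly from the given sponsor."""
    return [s for s in order if s != sponsor and not share_underscore(sponsor, s)]


for name in ("UO", "UA", "ED", "UK"):
    print(name, "differs from", differs_from(name))
```

Two sponsors differ significantly exactly when no single span covers both of them, which is why ED and UK, which fall under every span in this sketch, differ from no one.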

Obviously, Table 105 contains a tremendous amount of information.
Rather than attempt a laborious interpretation of the results for each
of the 41 variables, we will discuss the results for just a few key
variables. For example, Variables 8 (small group instruction), 11 (wide
variety of activities), and 13 (participation of an aide) all clearly show
striking FT/NFT differences; virtually all sponsors average above the
pooled NFT. These results are interpreted as clear evidence of discernible
FT treatments that exist at the classroom level and that involve
individualized or small group approaches and wide varieties of activities
(such as games and puzzles; see Variable 5).

Inspection of differences for several interaction variables (14 through
41) further validates the existence of several of the more highly defined
planned variations. For example, all sponsors are below NFT on the
relative occurrence of teacher didactic behaviors (Variable 19), and yet
the rank order of sponsors on this variable corresponds quite well to that
predicted on the basis of the model descriptions (i.e., UF, UK, and UO are
among the highest and FW, NY, HS, and ED are among the lowest). Similarly,
both UK and UO are highest on the relative frequency of teacher praise
(reinforcement, Variable 32), whereas ED is lowest and NFT is in the
middle. FW, the "responsive environment" model, averages highest on
Variable 33 (adult positive corrective feedback).

This method of interpretation of results is highly qualitative and
subjective. However, in the absence of rigorous evidence of both the
quality and extent of program implementation, and in the absence of
unitary criterion measures, we feel these data are consistent with many of
the key descriptive features of the models. Pending further, more intensive
analyses on larger and more representative data, we offer the following
TABLE 105

OBSERVED SPONSOR DIFFERENCES FOR CERTAIN
CLASSROOM PROCESS VARIABLES

                                          SPONSOR MEAN*
PROCESS VARIABLES                         LOW                          HIGH

Activities (CCL)

 1. A-Lunch, snack                        UO, UA, ED, UK, HS, UF, NY, BC, FW
 2. B-Group time, story, singing,
    dancing                               UO, UK, NY, UF, BC, HS, UA, FW, ED
 3. C-Arithmetic, numbers, math.,
    reading, alphabet, language
    development                           ED, UF, BC, HS, FW, NY, UA, UK, UO
 4. D-Social studies, geography,
    science, natural world                UO, UF, UK, NY, BC, UA, FW, HS, ED
 5. E-Games, puzzles                      UO, UF, UK, UA, HS, BC, NY, ED, FW
 6. F-Arts, crafts, sewing,
    cooking, pounding, sawing             UF, UK, NY, UO, HS, BC, FW, UA, ED
 7. G-Blocks, trucks, dolls,
    dress-up                              UO, UF, NY, UK, HS, UA, FW, BC, ED
 8. Adult with small groups in
    academic activities                   ED, UF, FW, BC, NY, HS, UA, UK, UO
 9. Academic activities                   ED, UF, HS, BC, FW, NY, UA, UK, UO
10. Independent child activities          UF, UO, HS, UK, BC, UA, NY, ED, FW
11. Wide variety of activities            UF, UO, NY, BC, HS, UK, UA, FW, ED
12. Adult with one or two
    children in all activities            HS, NY, UO, UF, UA, UK, BC, FW, ED
13. Aide participating in
    academic activities                   ED, NY, BC, HS, UF, FW, UA, UO, UK

Interactions (FMO)

14. Adult informing child
    symbolically                          NY, UA, UO, HS, FW, UF, BC, UK, ED
15. Adult asking child a direct
    question                              FW, UF, BC, UK, ED, HS, NY, UA, UO
16. Direct question followed by
    child response                        FW, UF, BC, UK, ED, HS, NY, UA, UO
17. Adult praise and corrective
    feedback                              UF, ED, UA, FW, NY, BC, HS, UK, UO
18. Child response followed by
    adult feedback                        ED, UK, UA, FW, NY, BC, UF, HS, UO
19. Adult informing child                 FW, NY, HS, ED, BC, UA, UF, UK, UO
20. Adult asking child thought-
    provoking questions                   UO, NY, ED, FW, UK, BC, UF, UA, HS

↑ = Mean for aggregate of NFT classrooms observed (N = 26).

* Sponsor means on each variable are ordered in increasing magnitude.
Underlining indicates subsets of no significant difference (p < .05)
between sponsors, as determined by the multiple range test, Newman-Keuls
method.
TABLE 105 (Concluded)

                                          SPONSOR MEAN
PROCESS VARIABLES                         LOW                          HIGH

21. Adult informing child with
    concrete objects                      UO, NY, ED, BC, UA, UF, FW, HS, UK
22. Adult acknowledgment                  ED, UK, FW, UF, UA, UO, BC, NY, HS
23. Child informing himself with
    objects                               UK, FW, BC, UF, UA, ED, HS, NY, UO
24. All child's self-learning             NY, UF, UA, FW, BC, ED, HS, UK, UO
25. Child informing another child         ED, UO, UF, NY, UK, BC, FW, HS, UA
26. Child informing himself
    symbolically                          NY, UA, UF, FW, BC, ED, HS, UO, UK
27. Child questioning adult               UK, UF, UO, BC, ED, NY, HS, FW, UA
28. Child's self-expression               UO, UK, ED, HS, UF, BC, UA, FW, NY
29. Adult communication focus:
    one or two children                   ED, NY, UF, FW, HS, UO, UA, BC, UK
30. Adult communication focus:
    small group                           FW, BC, UA, NY, UF, HS, UK, ED, UO
31. Adult communication focus:
    large group                           UO, HS, NY, UK, UA, FW, BC, ED, UF
32. Adult praise/acknowledgment
    of child                              ED, UA, FW, UF, BC, NY, HS, UK, UO
33. Adult positive corrective
    feedback                              UF, ED, HS, NY, UA, BC, UK, UO, FW
34. Adult negative corrective
    feedback                              UF, UO, ED, UK, FW, NY, HS, UA, BC
35. Adult negative affect                 UO, FW, NY, UA, ED, UK, HS, BC, UF
36. Child negative affect                 UO, ED, UF, BC, UA, NY, FW, UK, HS
37. All negative affects                  UO, ED, UF, BC, UA, NY, UK, FW, HS
38. Adult to child positive
    affect                                NY, BC, UO, ED, UK, UA, UF, HS, FW
39. Child to adult positive
    affect                                UO, UF, NY, ED, BC, UK, UA, FW, HS
40. All positive affect                   ED, NY, BC, UK, UO, UF, UA, HS, FW
41. Child positive affect                 ED, UF, UO, BC, HS, NY, UK, UA, FW






evaluation of FT treatments, based on classroom observation evidence: 



(1) Sponsored approaches do discernibly differ from one another 
for many process variables. 

(2) Processes characteristic of various FT approaches predictably
depart from characteristics observed in NFT classrooms
for many process variables.

(3) Analysis of factor scores and discrete variable scores 
presents strong evidence of instructional activities and 
components that correspond well with descriptions of 
intended approaches, thus validating in part the concept 
of planned variations in FT treatments. 
Section V

DISCUSSION AND RECOMMENDATIONS

The discussions and recommendations presented in this section are 
based on our experiences in dealing with the intricate and often difficult 
problems encountered in this massive and unprecedented evaluation of a 
nationwide intervention program. We have attempted to focus these dis- 
cussions on issues pertinent to an evaluation of programs like Follow 
Through and have tried to remain objective throughout. The section is 
divided into the following three parts: 

(1) Discussion of the interim results and recommendations for 
improving the evaluation of Follow Through 

(2) Recommendations concerning evaluations of future education 
and social action programs 

(3) Recommendations for specific policy decisions regarding 
compensatory education programs. 

Discussion of the Interim Results and Recommendations
for Improving the Evaluation of Follow Through 

The results described in the preceding section present a complex 
picture of findings. We would like to be able to present a simple and 
concise statement of interim effects, but feel that we cannot do so at 
this time. Many of our interpretive difficulties are due to the problems 
described in the following paragraphs. We feel that any conclusions drawn 
from this interim evaluation must be considered in light of these problems. 

The samples on which these interim results are based are small,
certainly too small to allow us to isolate approaches that "work" and
approaches that do not. We can conclude that some changes are taking
place, but we do not yet know precisely what they are or why they are
occurring. At a more general level, the parent, teacher, classroom
observation, and community data indicate that Follow Through is succeeding
in measurably altering adult attitudes and behaviors in the home, the
school, and the community. Evidence that these changes in adults are
having impact on the children is less marked and more variable, but
results tend to indicate positive effects on FT pupils. It is likely that
in future analyses on larger and more representative samples, evidence
of program impacts on pupil attitudes and achievements will be considerably
more marked.

In addition to the limitations imposed by the relatively small
interim evaluation samples, we encountered complex problems of missing
data. These resulted from high attrition and, occasionally, inadequate
baseline data. The magnitude of these problems was greater than anticipated
at the beginning because of the unprecedented nature and scope of this
research program. And, although we now know how to cope with them, they
restrict our ability to generalize about findings for Cohort I samples,
and to a lesser extent about findings for Cohort II samples.

Since Follow Through is a quasi-experiment, the allocation of
treatments to projects and the allocation of units to treatment or control
conditions within projects were nonrandom. One consequence of this
nonrandomness was that biases were introduced into the design. The bias
associated with the allocation of treatments to projects may not be very
serious. But the nonrandomness within projects (i.e., systematic
differences between FT and NFT samples) occasionally has serious
consequences. For example, in some projects, treatment and comparison
groups were very different. Although such differences are bound to occur
in pseudo-experiments for which control groups are assembled post hoc,
they present serious obstacles to the interpretation of outcomes. And where
comparison group biases are severe, we suspect they invalidate the results
of analyses for the projects affected.

These problems (missing data, differences between comparison and
treatment groups, and too few classrooms per project) combine to produce
relatively low statistical power in our analyses for effects. To some
extent this outcome was expected, since the Office of Education and SRI
made conscious decisions to concentrate data collection efforts at the
entry grade (K or EF) and at the exit grade (3) and to devote less effort
at the intermediate grades. Nevertheless, we are quite likely failing to
detect many important program impacts at this interim point.

As suggested above, a substantial number of program impacts are
evident in our analyses of interim data. Furthermore, we believe that
the true magnitude of the effects is probably somewhat greater than
detected by our analyses. But it is important to recognize that even if
the number of significant effects were strikingly greater, we would still
have difficulty interpreting how or why such results occurred because,
at present, our knowledge of the treatments is confined almost
exclusively to the sponsors' descriptions of them. We do have evidence
from limited subsamples on some of the characteristics of some processes.
This qualitative evidence indicates that classroom processes conform to
these treatment descriptions. To interpret how and why results occur,
we now need clear operational statements of what a sponsor does when he
is installing and maintaining a project and how he does it.



Finally, because of the complexity and variety of the intervention
approaches, or treatments, in the FT experiments, it is very likely that
many of the evaluation measures used were not uniformly appropriate,
sensitive, or relevant to varied objectives. Many program objectives were
probably overlooked in our assessments. The technology for evaluating
large scale social programs is in its infancy. We believe that we have
contributed substantially to the advancement of this technology through
our successful and unsuccessful experiences with evaluation instruments
and procedures. Yet there remains much more to be learned.

In sum, the data available for this interim evaluation were sampled
from a limited set of projects and are not adequate for comparing the
effectiveness of different program approaches. Some rather serious
problems with baseline data, comparability of FT and NFT groups, and
general attrition further hampered analyses. That quite a few significant
effects emerged (beyond an "experiment-wise" error rate) in spite of these
data analysis problems is certainly noteworthy.
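The idea behind an "experiment-wise" error rate can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the tests are treated as independent, each is run at the .05 level, and the test counts of 20 and 100 are hypothetical values chosen for illustration, not counts taken from this evaluation.

```python
def experimentwise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Number of 'significant' results expected by chance alone."""
    return alpha * n_tests


alpha = 0.05
for n_tests in (20, 100):  # hypothetical numbers of significance tests
    rate = experimentwise_error_rate(alpha, n_tests)
    chance = expected_false_positives(alpha, n_tests)
    print(f"{n_tests} tests: P(at least one false positive) = {rate:.3f}, "
          f"expected by chance = {chance:.1f}")
```

A count of significant effects is noteworthy only to the extent that it exceeds the number expected by chance under this kind of reckoning.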

When the pupil results are reviewed within the perspective of the
overall evaluation design, the likelihood of obtaining a significant
effect appears to be associated with several rather crucial evaluation
parameters. In particular, the magnitude and frequency of FT-favoring
pupil results appear related to:

• The relative comparability of families in the FT and NFT
samples within a project (quality of match). That is, as
the quality of the match improves, the frequency and
proportion of FT-favoring results also tend to improve. That
bad matches tended to result in NFT-favoring results is
primarily because the initial biases were extreme in favor
of NFT, often suggesting that two separate populations were
being compared.

• The severity of impoverishment and disadvantagement relative
to the mainstream social structure. Projects in the most
impoverished communities showed some of the most dramatic
gains, but these were sometimes statistically unreliable and
often confounded with comparison group problems. This trend
may indicate the presence of a type of floor effect, but
more likely it is associated with major differences in the
social complexities of rural and urban communities.
• The amount of time the sponsor has had to refine and improve
implementation of his treatment. In general, first-year
impacts for 1970 samples (C-II) were stronger than for 1969
samples (C-I). Although this trend is confounded by certain
measurement difficulties associated with the first-year,
Cohort I data, the differences appear large enough to support
our interpretation.

• The grade level of the pupils and the amount of time they
spent in the program. This interpretation is suggested by
the fairly regular cumulative trend observed for the Cohort I-K
samples (second-year effects were almost always stronger than
first-year effects). Also, the effects on Cohort II-EF samples
(pupils in the first grade) tended to be larger than those on
Cohort II-K samples. These trends do not obtain for Cohort I-EF
samples, probably because the proportion of "good" matches
in these samples was very low (i.e., 14 percent for Cohort I-EF
versus 50 percent for Cohort I-K).

When the four trends evident at this interim point are combined, it
appears that Follow Through has most often been successful in projects
located in truly disadvantaged communities when there has been enough time
to implement the model properly. In addition, the effects appear cumulative,
and impacts appear stronger at higher age levels.

Admittedly, we may be stretching the available evidence to generate 
these specific interpretations. But it is definitely no exaggeration to 
conclude there is evidence of impact. Furthermore, given all the uncon- 
trolled variation and lack of rigor represented in this "pseudo-experiment," 
it would not have been surprising if such evidence were altogether absent. 

This point deserves further clarification. At the outset of the
Follow Through experiment some planners apparently expected that the
impact of Follow Through on poor children would be so dramatic that it
would be clearly evident, independent of sophisticated inferential
statistical methods. That this expectation was overly optimistic probably
should have been anticipated, since, in many instances, laboratory and
field experimental data have yielded only moderate-sized effects under
highly controlled conditions. In Follow Through, the treatments are
administered by teachers and parents, not by trained experimenters.
Furthermore, we must assume (although we do not know) that some changes or
losses occur because of imperfect implementation of the models; perfect
implementation could hardly be expected. And, since degree of
impoverishment is a dominant eligibility component for obtaining an FT
grant, understandably the truly poor schools within districts were quickly
absorbed into the treatment. This made the task of finding comparable NFT
schools and families difficult, and occasionally impossible. Lack of
comparability often meant that, to be clearly detected, an FT impact must
emerge over and above a whole host of competing factors and extraneous
sources of variance, many of which recently have been demonstrated as
determining the great majority of pupil outcomes (see Coleman et al., 1966;
Jencks et al., 1972; Mosteller & Moynihan, 1972). In many laboratory and
field experiments on educational treatments the results have been modest,
often showing effects of a standard deviation or less even when competing
factors and extraneous sources of variance are controlled or held at a
minimum (Gray & Klaus, 1965; Hodges, McCandless & Spicker, 1967; Passow,
in press).

It is an understatement to argue that treatments developed in the
laboratory would be suboptimal in a natural setting. Hence we must ask
the question, "How large do we expect the impacts of the various FT
approaches to be after one year? After two years?" For some of the models
represented as treatments in this evaluation, results from laboratory and
field experiments suggest an answer. But for the majority of the models,
there is no way of even guessing. For those treatments that have been
validated experimentally, the effects detected were generally
approximately one standard deviation under highly controlled conditions
(i.e., the mean of the treatment was not greater than two standard
deviations from the mean of the control). Without these controlled
conditions, as in the FT program, the effects most likely decrease in
apparent magnitude, and correspondingly their detection becomes less
likely.
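The link between effect magnitude and the likelihood of detection can be sketched numerically. The example below is an illustration under stated assumptions, not a reanalysis: it uses a normal (z-test) approximation for a two-sided comparison of two equal-sized groups, the group size of 20 is hypothetical, and the function name is introduced here for convenience.

```python
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the treatment-control mean difference expressed in
    standard deviation units; both groups are assumed to have n_per_group units.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size * (n_per_group / 2.0) ** 0.5
    # Probability that the test statistic falls in either rejection region.
    return (1.0 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


n = 20  # hypothetical number of units per group
for d in (1.0, 0.5, 0.25):
    print(f"effect of {d:.2f} SD, n = {n} per group: power = {approx_power(d, n):.2f}")
```

With the group size held fixed, an effect that shrinks from one standard deviation to a quarter of that goes from being detected most of the time to being missed most of the time.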



The Need for More Precise Treatment Definitions
and Descriptions 

We view the absence of careful and precise definitions of FT treat- 
ments and their associated delivery systems as one of the more serious gaps 
in the total Follow Through evaluation. By a definition of treatment, we 
mean operational statements of the specific manipulations the sponsor in- 
tends to implement within the project. By delivery system, we mean the 
actual materials and procedures he employs to effect this implementation.
One basic reason for the Follow Through experiment is to provide a testing 
ground for the wide variety of both implicit and explicit hypotheses and 
theories that have been advanced to explain why disadvantaged pupils per- 
form poorly. These implicit or explicit theories presumably served as 
the bases for the intervention treatments, the effects of which are the 
subject of this evaluation. But in the current situation when we encounter 
program failures or negative results, we have no way of knowing, or some- 
times even guessing, why. It could be because the theory is wrong or 
because some or all of the treatment was not properly implemented. 
The necessary linkage between the specification of treatment and
evidence of impact is missing because at present we do not know the
literal materials and procedures that a given sponsor employs in setting
up his model in a particular project. It is true that the evidence
obtained from the classroom observation procedures indicates that FT
teachers tended to exhibit behaviors that, more often than not, were
consistent with the relevant features of their respective models. But this
evidence is insufficient for several reasons. First, this classroom
observation evidence indicates the extent to which certain implementations
are occurring; it does not tell us specifically how implementation was
originally accomplished or specifically how it is maintained. Second, we
currently have no rigorous independent evidence of the extent to which the
presence of the observer influenced the behaviors of the teacher
(pseudo-treatments), or the extent to which observer bias existed. Third,
only a subset of models attempt to operate directly on the teacher and
classroom process. For several approaches, the parent, home, and community
are the vehicles by which the model becomes implemented.*



Thus, if the longer-range evaluation of Follow Through is to have
any payoffs in terms of the identification of "significant" treatments
or treatment components (i.e., those treatment components to which
significant outcomes have been or can be attributed), necessary
descriptions of treatments and delivery systems must be collected and
classified for appropriate future analyses. Without these data, it is very
unlikely we will ever understand the reasons that certain programs worked,
and even more unlikely that we will be able to describe how to "export"
successful approaches to other contexts or situations.



Competing Goals of Follow Through 

It is apparent that considerable confusion and ambiguity exist
concerning the goals of FT and its concomitant evaluation. The same array
of stakeholders is present as was present when the program began, and
a few others have joined. The major bifurcation is between those, on the
one hand, who (consistent with the intent of the original legislation)
believe that the Follow Through program was designed to enable low-income
families to participate in the administration of comprehensive services
being provided to their children through the school and those, on the
other hand, who wish to find out the ways in which 20 or so educational
innovators with different educational philosophies and techniques can
improve the level of achievement and enhance the self concepts of poverty
children.

* These comments are not intended to detract from our previously stated
confidence in the reliability of classroom observation findings. Indeed,
our observed patterns were anecdotally corroborated by sponsors and other
observers, and the agreements among raters appeared reasonably high,
suggesting that observer bias is not a major factor.

The former group, which includes not only most parents but also some
of the Local Education Agencies and even a sizable group of Federal program
officers, appears positive toward Follow Through and seems convinced that
the programs are accomplishing the goals. Our own data show that FT
families become involved in their child's education and regard the program
positively. The interim results do not yet allow us to say whether early
results on parental variables reflect a general phenomenon like the
"Hawthorne effect." Parents certainly appear pleased that something is
being done for their children even when they are unsure of the nature of
the programs. It may be that parents' responses indicate that they are
aware of effects on their children that have not yet been detected by the
evaluation data.

In any case, evaluative evidence is not of crucial relevance to 
this group. They feel Follow Through is a demonstration of what can be 
done when everyone works to improve conditions. Understandably, to these 
stakeholders, the requirements of experimental rigor must appear irrelevant 
and superfluous. These attitudes, though beneficial to the program in 
many ways, become deleterious when we seek to establish comparison groups 
whose performance we contrast with that of FT groups as a measure of pro- 
gram effectiveness. 

This interim evaluation has shown that for several projects the 
comparison group samples consist of pupils and families not at all like 
the FT project samples. We know that reasonable effort was extended in 
attempting to obtain proper matches for experimental purposes. But not 
only are truly poor families within a project disproportionately repre- 
sented in the FT groups (since they compose the subgroup for which the 
program was intended), but also it is difficult to see how similar and 
eligible families within the same district could long be prevented from 
either (a) enrolling their children in FT schools or (b) encouraging 
their own schools to adopt FT-like methods. Both of these actions are 
natural, even desirable, but they have negative consequences on the 
longitudinal evaluation design. In fact, a serious question arises as 
to whether or not we should continue the expense of data collection for 
projects where attrition and matching problems are severe. 
Most compromised by the program-versus-research dilemma are the
sponsors. On the one hand the sponsors are advocating intervention models
and providing delivery systems, so they have a definite stake in a good
evaluation. They would want the evaluation design to have as many
safeguards as possible to avoid biased or invalid assessments of their
effectiveness. Most are also interested in reliable and useful feedback
information on outcomes so that they can improve their approaches. On
the other hand, they are sensitive to the importance of participant "good
will" to make their approach successful, or, in some cases, to make its
implementation even possible. In this sense, the sponsor must consider
his every action in terms of its consequences not only on the overall
evaluation of his model, but also on how it affects relationships with
the parents who have served as classroom aides and who, as PAC members,
have shaped the programs to fit the community and vice versa and who,
the sponsor may believe, are crucial to the effectiveness of his model.

Problems in Measurement 

Another serious impediment to clarity of evaluation results appears 
to be in the very nature of the kinds of results many of us are looking 
for. For example, the simplest way to increase the apparent clarity of 
results is to utilize measures that are both reliable in themselves and 
externally valid (sensitive to treatment effects). Achievement measures 
are generally reliable and face-valid, but they are differentially rele- 
vant to various program objectives. That is, nearly all models purport 
to have impacts on pupil achievements, but descriptions vary as to how 
and when these impacts will emerge. Similarly, in all FT models, the 
development of positive affect in one form or another is considered a 
goal. In some FT approaches, the emergence of this positive affect is 
considered necessary for meaningful learning to occur. In other approaches, 
a positive self-image is considered a consequence of successful academic 
achievement. Unfortunately, regardless of one's theoretical predisposition 
toward the construct, currently available measures of pupil affect show
poor, almost unacceptable, reliability (as do most noncognitive measures).
Since validity is dependent on reliability, these measures are not highly
useful in their current state of development.
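The dependence of validity on reliability can be made concrete with the classical test theory attenuation relation, under which the observed correlation between a measure and a criterion equals the true correlation multiplied by the square root of the product of the two reliabilities. The sketch below is illustrative only: the reliability values are hypothetical, not estimates for any instrument used in this evaluation, and the function names are introduced here.

```python
from math import sqrt


def max_validity(reliability_x: float, reliability_y: float = 1.0) -> float:
    """Upper bound on the observed correlation between measure x and criterion y."""
    return sqrt(reliability_x * reliability_y)


def attenuated_correlation(true_r: float, reliability_x: float,
                           reliability_y: float = 1.0) -> float:
    """Observed correlation expected when both measures contain random error."""
    return true_r * sqrt(reliability_x * reliability_y)


# Hypothetical reliabilities: a well-behaved achievement test versus
# a weak noncognitive (affect) measure.
for label, r_xx in (("achievement-like", 0.90), ("affect-like", 0.30)):
    print(f"{label}: reliability {r_xx:.2f}, "
          f"validity ceiling {max_validity(r_xx):.2f}, "
          f"observed r for a true r of .60 = {attenuated_correlation(0.60, r_xx):.2f}")
```

A measure with low reliability therefore cannot show a strong relationship to treatment or to any criterion, however real the underlying effect.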

Another strategy might be to compile separate assessment batteries 
tailored to the goals and objectives of each model . This strategy would 
certainly enhance the validity of conclusions in the evaluations of in- 
dividual approaches, but would seriously impair, if not preclude, one's 
ability to draw inter-approach conclusions. In fact, early attempts to 
construct a comprehensive evaluation battery made extensive use of 
sponsor-contributed items, but in the long run this procedure satisfied
no one and, by 1971-72, was abandoned in favor of standardized or "off 
the shelf" instruments.* 

Establishing the temporal relevance of evaluation measures to the 
long-range goals of the program presents still another problem. The 
general objectives of improving the life chances of poor children suggest 
that what are needed are sets of measures that validly predict (or diag- 
nose) the long term life chances of such children. In addition, these 
measures would have to be appropriately sensitive to the effects of 
specific treatments. The only set of measures that even comes close to
meeting these requirements in the current interim data are the achievement 
scores. But current research (Jencks et al., 1972) has called even these
measures into question in terms of their predictive validity and utility. 

We feel that this problem of predictive validity of the evaluation 
measures will probably not be resolved in the near future. What seems 
more likely is that appropriate developments and advancements in theories 
of instruction should lead to the specification of criterion skills and 
behaviors, the attainments of which will be both testable and useful as 
ends in themselves. Similarly, we expect that the ambiguity currently 
associated with noncognitive objectives will be resolved either by speci- 
fying theoretically relevant criterion behaviors or by abandoning the 
domain as impractical.

Another related measurement issue is the need for clarification of 
certain paradoxical or equivocal response patterns obtained from our 
survey instruments. The most striking example comes from teachers' 
responses. FT teachers tended to indicate that they approved of their
current teaching methods and procedures; that is, they were less inter- 
ested in alternatives than were NFT teachers. Yet FT teachers also tended 
to answer that meeting with parents was less essential than the NFT teachers 
believed it was. Almost without exception, FT methods prescribed greater 
adult participation in the classroom. If we are to assume these measures 
are reliable and valid, then at least three interpretations can be advanced: 

• FT teachers are pleased with parent assistance, but they do not 
feel parents should be directly involved in educational activ- 
ities. 



* The current (1972-73) battery consists of the Metropolitan Achievement
Test (1970 edition), the Progressive Matrices Test (problem-solving), and
selected noncognitive instruments (the I.A.R., Gumpgookies, Locus of Con-
trol, and Coopersmith Self-Esteem).
• FT teachers are pleased with parent assistance and since they
are in classroom contact with them, the teachers see less need 
for outside contacts. 

• FT teachers are not pleased with parent participation and are 
responding to other factors on the "professional acceptance" 
measure.

Our problem is that, based on the current data, we have no way of
choosing which of these (or possibly other) interpretations is correct. 
But since the measures often reliably differentiated between FT and NFT 
teachers, it would be unfortunate if at least some follow-up activities
were not instituted to resolve these ambiguities. 

A similar example from the parent interview variables is found in the 
"satisfied with progress" and "academic expectations" measures. On these 
measures, FT parents more often indicated satisfaction with their children's
progress than did NFT parents. On the other hand, NFT parents tended to 
have higher overall academic expectations for their children than did FT 
parents. If we accept these results as valid, then FT parents definitely 
have lower aspirations for their children (i.e., they tend to be more 
satisfied with less progress and are less optimistic about future growth). 
On the other hand, FT parents may be more realistically appraising the 
educational situation of their children, whereas NFT parents are less 
realistic. Again, additional information would be useful for resolving 
this interpretive problem. 



Attrition and Its Implications 

Some preliminary evidence suggests that some rather serious problems 
will likely be encountered in the final stages of this longitudinal evalu- 
ation because of apparent patterns of differential FT/NFT attrition. 
Our current tracking data show that the half-life of the comparison sam-
ples is about two years. (That is, the size of the comparison sample
reduces by one half every two years, on the average). Thus, by the end 
of four years in this longitudinal study, only about one-fourth of the 
original comparison group is expected to be available for assessments. 
Although the attrition rate is considerably smaller for FT (half-life of 
about three years), it will be very difficult to draw valid inferences of 
effects, given what is likely to be a nonrepresentative residual of an 
already biased comparison sample. 
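
The half-life figures quoted above imply a simple exponential retention curve. The following Python sketch works through that arithmetic. The function name `fraction_remaining` is an illustrative assumption, and constant-rate decay is a simplification of the tracking data described in the text, not a method taken from the report.

```python
def fraction_remaining(years: float, half_life_years: float) -> float:
    """Fraction of a baseline sample still available after `years`,
    assuming the sample halves every `half_life_years` (hypothetical
    constant-rate model)."""
    return 0.5 ** (years / half_life_years)


# Half-lives quoted in the text: about two years (NFT), about three years (FT).
nft_after_four_years = fraction_remaining(4, 2)
ft_after_four_years = fraction_remaining(4, 3)

print(f"NFT remaining after 4 years: {nft_after_four_years:.2f}")
print(f"FT remaining after 4 years:  {ft_after_four_years:.2f}")
```

Under these assumptions, about one quarter of the NFT sample and roughly two fifths of the FT sample would remain after four years, which illustrates the differential attrition discussed above.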

It might well be prudent to begin considering alternative strategies, 
one of which might involve a shift to cross-sectional matched sampling,
for the outcome data collection among exiting cohorts. This shift in 
strategy would, of course, mean that the longitudinal nature of the ex-
periment would be replaced, in some (or, possibly, all) projects, by a
broader based, cross-sectional model. Moreover, our regression analyses
have shown that over 60 percent of the variance accounted for by all co-
variables can be attributed to parent/home/environment indexes. Thus,
if careful matching is implemented, a cross-sectional design might restore
most of the power initially represented in the longitudinal design but
lost because of attrition.



Summary of Recommendations for the Continuing FT Evaluation 

On the basis of evidence and discussion presented in the body of this 
report, we advance the following recommendations, which we feel will en- 
hance the quality and utility of the final FT evaluation. 

(1) Clear and precise operational definitions of sponsor "treat- 
ments" and equally precise descriptions of their delivery systems 
are needed to improve understanding of how and why these treat-
ments are or are not effective. This information should be ob-
tained as soon as possible, since it will be relevant to the
evaluation of currently exiting cohorts, and essential if the 
findings are to be applicable to new situations. 

(2) Analysis of statistical power for detection of treatment effects 
at the project level indicates that the number of classrooms per 
project is insufficient for the detection of moderately small
effects. If possible, the number of classrooms (FT and NFT) per
project should be increased.

(3) For many projects, NFT comparison groups were not considered suf- 
ficiently similar to FT groups to enable a valid analysis of 
program impacts. Unless comparability of these comparisons can
be improved by means of alternative designs (e.g., matched pairs,
cross-sectional), we recommend that these "mismatched" projects
be deleted from subsequent data collections.

(4) The magnitude of the attrition problem appears to be far greater 
than initially estimated, particularly for the NFT sample. It 
appears that over a four-year duration, NFT pupil attritions range 
as high as 80 percent of the baseline sample. We strongly recom-
mend alternative evaluation designs be studied for possible
adoption in the near future. One highly feasible alternative 
appears to be a matched pairs, cross-sectional design. 
(5) Although we believe that the current (1972-73) evaluation bat-
tery is adequate in many respects, the possibility of gathering
sponsor-specific measures should be reconsidered. Given the
divergent goals of the different approaches, such measures would
provide alternative bases on which to assess impacts.

(6) Alternatives to the current noncognitive measures should be
considered. One such alternative might be the use of classroom
observation data to index patterns of personal and social growth
and development.

(7) The classroom description component (CCD) of the classroom ob-
servation should be utilized more frequently and extensively in
subsequent data collections to better characterize overall class-
room practices and emphases. The possibility of continuous sam-
pling on case study bases should be considered.

(8) Further investigation of teacher and parent attitudes and be-
haviors is needed to resolve ambiguities in a number of important
effects noted in these interim data.
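
Recommendation (2) above turns on how statistical power grows with the number of classrooms per group. As a rough illustration only, the sketch below uses a two-sided, two-sample normal approximation. The function name `approx_power`, the effect size of 0.5, and the classroom counts are hypothetical values chosen for the example and are not taken from the SRI power analyses.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size: standardized difference between FT and NFT classroom means
                 (hypothetical input, in standard deviation units).
    n_per_group: number of classrooms in each group (assumed equal).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of rejecting in either tail under the assumed effect.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Illustrative classroom counts only; the report does not list these values here.
for n in (5, 10, 20, 40):
    print(n, round(approx_power(0.5, n), 2))
```

With only a handful of classrooms per group, the approximate power for a moderate effect stays low, and it rises steadily as classrooms are added. This is the qualitative point behind the recommendation to increase the number of FT and NFT classrooms per project.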
Recommendations Concerning Evaluations of Future Education
and Social Action Programs



The Need for Planning the Evaluation 

Evaluation of Follow Through has demonstrated to us that it is essen-
tial to have a formative period before a summative evaluation of any
social action program is begun. The formative stage is required to clarify 
the purposes of the evaluation. It is also needed for defining evaluation 
questions and hypotheses, clarifying and understanding the nature of the 
alternate treatments or procedures that will be assessed, and assembling 
and sharpening the tools of measurement. Hopefully, in addition, some of 
the conflicts common to all social experiments can be explicitly recognized 
and reconciled during the period. 

One issue that demands attention during the planning stage is the 
state of the art in both measurement and statistical analyses. The temp-
tation in undertaking an evaluation is to assume that procedures can be
rapidly developed or that ways to deal with problems of measurement and 
analysis will emerge. There is, however, great danger in basing one's 
plans on the assumption that such developments will occur. A simple
example can be cited from Follow Through. While the importance of measures 
in the noncognitive domain was clearly recognized, neither the limita- 
tions in existing instruments for use in large-scale measurement appli- 
cations nor the difficulties inherent in effecting new developments were 
equally clearly recognized, and certainly not on a schedule that would 
be useful to the evaluation. Evaluation planning, in short, should be 
limited to the state of the art as known or best estimated at the time 
the evaluation begins so that unrealistic expectations about new develop- 
ments can be avoided. If it is decided that development efforts and 
evaluation activities should be undertaken simultaneously, these activi-
ties should not be made time-dependent on one another. The evaluation
can be used as a vehicle for development, but the pace of development 
should not be tied to the evaluation schedule, and the evaluation should 
not depend on products from the development activities.

A major lesson in evaluation, learned from Follow Through, stems 
from the difference between controlled experiments and naturalistic 
studies in the "assignment" of subjects to treatments. The wrong decisions 
about the communities to receive treatments, the school teachers and pupils 
within these communities to participate, and the mechanism by which pro- 
grams are effected can complicate or seriously jeopardize the quality of 
an evaluation. There are ways to improve the design of social experiments 
similar to Follow Through, and thereby strengthen the power of the study, 
without doing disservice to poor children. Even when no options in the 
selection of participating communities are available, it should be possi-
ble to shape the procedure by which program and district affiliations are
determined so that each type of program would be balanced in its project 
representation according to geographical location, type of community, or 
other relevant dimensions. Within each project location, it would be 
possible to establish both experimental and control groups simultaneously
rather than leaving the problem of controls to be resolved later. In 
addition, it might be possible, within a large-scale program, to require 
that all programs that are to be evaluated be adopted by a certain minimal 
number of sites or groups and also that all sites wishing to receive grants 
choose a program from among the several to be evaluated. 

There is great value in a stable external panel of advisers to pro- 
vide continuity and counsel. Such a panel need not be fixed in its mem- 
bership; panel members can be replaced from time to time to avoid some 
of the dangers of proprietorship or parochialism. The important con- 
sideration is that of an advisory body to help both the sponsoring agency 
and its contractors recognize and reconcile problems of planning, oper- 
ations, analyses, interpretation, and reporting. To the extent that such 
consulting bodies can be influenced to accept accountability for the re- 
search products, their utility will be enhanced. 

The Follow Through evaluation to date has ignored costs as an evalua- 
tion variable to be considered systematically in analyses. The implica- 
tions of program costs for policy decisions are too great to ignore; any 
social experiment should incorporate studies of cost/effectiveness and 
cost/benefit analyses as part of its plan from the outset.

One aspect of project costs that concerns evaluation efforts is the 
possible need for a quid pro quo account in the budget. Such funds 
would be drawn upon when the evaluation design requires the cooperation, 
for control purposes, of groups that do not receive program grants. Risk
beyond what is necessary is entertained if one must rely solely on persua- 
sion and diplomacy to win participation from groups that do not receive 
any project funds. 

Critical Dimensions for Evaluation Planning

The alternate positions on dimensions such as those listed in the 
following paragraphs should be clearly stated and the implications of 
alternatives for policy-making should be understood in detail before an 
evaluation design is even attempted. The evaluator must recognize that 
for any position chosen he either sacrifices at one end of the dimension 
to realize gains at the other or he must expand the purposes of the study 
and allocate additional resources to cover separate efforts.



The Service Orientation versus the Research Orientation — For example, 
a service orientation toward Follow Through would dictate assuring that 
only the most needy were participants in the program. In contrast, an 
experimental orientation would either assure equal neediness for partici- 
pants and nonparticipants or would permit variations on the impoverishment 
scale so that a "treatment by poverty" interaction might be identified.
A service orientation would also argue for a standard presentation of
services to all who qualify, whereas an experimental orientation would
encourage a greater variety of services. Finally, evaluating a program
under a service orientation would require some pre-defined standards
against which program success could be measured. In contrast, an experi-
mental orientation would more likely ask whether inter-treatment differ-
ences existed, and if so, where, to what degree, and so on.



The Policy Orientation versus the Theory Orientation — For purposes 
of deciding on continuation or termination, a policy-maker might ask,
"Does the program work?" A theoretician is more likely to be concerned
with the conditions, including treatment variations, under which particular
effects are observed. An investigator with a policy orientation would 
suggest contrasting input levels with output levels without necessarily 
trying to ascertain what happens during the process. The theorist, while 
not uninterested in input-output differences, is particularly concerned 
with the mediating processes. A policy-maker is also more likely to ask 
cost effectiveness and cost benefit questions about the data, whereas a 
theorist may, in many cases, ignore the cost variable. 

The Formative Orientation versus the Summative Orientation — If one 
decides that formative assessment is most important, he is by that choice 
encouraging the program to change as it grows in response to frequent and
fairly rapid feedback. Summative assessment, on the other hand, is more 
congenial to a stable treatment observed over a sufficiently long period 
to permit conclusions to be drawn about the whole program or the relative 
strength of fixed alternatives. Formative assessment is most appropriate
for those conditions that exist when designing a system is the primary
objective, and summative assessment is most appropriate to conditions
where the objective is to test the worth of a describable system.
General versus Specific Criteria of Success -- The dichotomy between
general and specific criteria might be referred to as abstract versus
concrete results, or less measurable versus more measurable outcomes,
or broad versus narrow criteria, or aggregated and gross outcome measures
versus disaggregated measures of individual outcomes. In other words, if
one wishes to know whether experimental approaches are effective in reach- 
ing their own goals, one chooses different sets of criteria in different 
time frames than if one wishes to assess all approaches on a single set 
of effectiveness criteria. 

Frequent Reporting versus Deferred Reporting -- The question of frequent
versus deferred reporting is related to two questions discussed earlier:
formative versus summative assessment and policy versus theoretical orienta-
tion. If findings are reported frequently, then the risk of premature
conclusions is increased. On the other hand, if reports are deferred too
long or excessively qualified when issued, their utility for the policy
maker may be lost entirely. Perhaps the balance can be struck by acknow-
ledging the legitimacy of each class of report, for different uses. In
addition, thoughtful consideration to time constraints must be given very
early, if the information provided to decision-makers is to be useful.

Recommendations for Specific Policy Decisions Regarding Compensatory
Education Programs 

Sponsor Programs 

The primary focus of this evaluation of Follow Through is the assess- 
ment of the effectiveness of the sponsored programs. The most general 
finding is that, on the measures obtained over the mere two years of the
evaluation (mostly with kindergartners and first graders), there is not 
yet evidence that one or several sponsors' programs stand out as consis- 
tently superior to the comparison program. Arguments have been advanced 
that the expectation of finding such a result at this time is probably 
premature and overly optimistic because of the amount of uncontrolled
variation, difficulties in implementation, and problems in obtaining 
adequate comparison groups. 

To speculate about eventual outcomes, let us imagine for a moment 
that the planned evaluation is complete and that a good match was achieved 
in several projects for each sponsored program. On the basis of present 
trends, the most reasonable guess is that, for example, outcomes for
third-grade, Cohort III participants would still reveal that no sponsor 
was effective in all projects. Sponsors with highly academically struc- 
tured classrooms would more often reveal superiority on tests of achievement, 
but it is likely that in some projects, the NFT classrooms would equal
or exceed the FT classrooms on mean achievement. Child-centered classes
with more self-regulatory activities might prove superior to their com- 
parison groups on some measures of problem-solving ability and locus of 
control, but their performance on overall achievement tests such as the 
WRAT would probably vary considerably. We would probably also find some 
reversals such that NFT groups exceeded FT groups, even in those areas
where the models were well implemented.

Even if such projected results occur in a final evaluation, one
could still probably identify some sponsored programs that produced sub-
stantial impacts on relevant variables across two sites. What
decision would then be justified? Some policy makers assume that if
such successful approaches could be identified, the finding would provide
a direct guide to action. But finding that a sponsor, a community, and
a school district can work together effectively is not the same as dis-
covering the precise conditions that enabled their joint success. True,
a sponsor who has been associated with several sites producing good results 
is likely to be doing something right. But without a more comprehensive 
catalog of the variables included in the treatment, and the identification 
of these variables in action across sites, no one could say what was right. 
So prescriptions for compensatory education programs other than Follow
Through itself are difficult to derive. 



If the following expectations are held by policy makers, they should 
be discarded as unrealistic: 



• The sponsor has a "package" -- a clearly delineated set of training
manuals, administrative procedures, curriculum materials, and
accountability mechanisms -- which can simply be applied again to
another school district to bring the same results.

• The sponsor has identified the crucial variables of his model and 
can share all the essentials with others, and thus, the "package" 
can be applied by people other than the sponsor himself.

• It is the sponsor and his model that contain the key variables 
determining success and failure. Furthermore, we already know 
certain factors to be unimportant.



By laying stress on the model and its transferability these viewpoints 
ignore the possibility that, for example, the nature of the Federal-local 
funding relationship and the nature of the school district-sponsor court- 
ship are potent determiners of success or failure. Effects of societal 
events surrounding the schools (growth of Chicano, Indian, and Black
pride; teacher underemployment; teacher unions; school finance inequality
controversies) are also not yet adequately considered as crucial factors.
For those sponsored programs that repeatedly fail to produce evidence
of impacts across sites, we must explore the reasons for these failures. 
We should establish, to the extent possible within evaluations like Follow 
Through, what contributions are required from teachers, parents, and school 
administrators, what amounts and types of federal-state-local interactions
are necessary, and how much funding it takes to achieve the desired out- 
comes. 

What of other compensatory education programs, those that are cur- 
rently being planned by federal or state education agencies? Have we pro- 
vided any guidance at this interim stage? At this point, several of the 
FT projects might be looked on as successful demonstrations. Since it 
is not clear which are the crucial and which the superfluous factors,
however, the new program could only attempt to incorporate as many aspects 
of the "successful" projects as can be identified. These common elements 
might include the same sponsor (or some similar third-party change agent),
who had been: a) successful with a similar population, b) invited by 
school district officials in consultation with parents, c) accountable 
to the funding agency as well as to the local district, d) funded at the 
same overall level, and e) provided with the necessary and sufficiently 
committed support staff. If these conditions cannot be established,
no prescription of the treatment seems justified.

The Overall Evaluation 

While no direct process comparisons were made between FT and NFT 
classrooms, project by project, we have determined that, on the average,
sponsored classrooms differ from comparison classrooms on several process
dimensions. Parent and teacher data, and community studies as well,
give us strong reason to believe that something potentially powerful is 
happening. On the question of overall impact of Follow Through, however, 
we have as yet only a little positive evidence that the effects are cumu-
lative. Although a "go/no-go" decision on the FT experiment is not known
to be pending, the evidence so far would favor continuation of the project, 
even though the current lack of comparability of experimental and control 
groups and the lack of treatment specification argue for substantial
changes in the evaluation design. 

Project Level Factors 

Any action taken on the basis of present evidence to extend, modify, 
or terminate projects on the basis of interim outcomes (aside from im- 
proving matches with the comparison groups) could have a dramatic effect 
on the outcome of the overall program and, thus, on its future evaluation.
We have assumed that the internal (e.g., curriculum, personnel) components
were key treatment variables, but factors such as the locus of decision-
making power in a project may be even more influential. It seems clear
from the evidence in the Community Studies report (SRI, 1972b, Appendix E) 
and observation of parent groups in action that parents are quite sensitive 
to the possibility of arbitrary decision-making. Changes in unwritten 
social contracts without parents' participation may violently shift parents' 
roles in Follow Through. The change in parents' attitudes that would 
occur if projects were dropped on the basis of an evaluation they do not
consider valid might radically affect the impact of the program in the 
sites remaining. 
Annex A



ISSUES IN THE ANALYSIS OF THE DATA 

This annex presents data and discussions on two fundamental analy- 
sis issues: 

• The unit for analysis 

• The method of analysis. 

Part 1 discusses the use of classroom data as a basis for analysis of 
program effects. Part 2 discusses issues in the selection of ANCOVA as 
the method of analysis. We have attempted to present these issues and 
arguments in a manner that does not require advanced knowledge or expe-
rience on the part of the reader for appreciation or understanding.

Part 1: The Use of Classroom-Level Data as a Basis for Analysis
of Program Effects 



The unit of analysis chosen for estimating interim program effects
was the classroom. For our current evaluation purposes, the classroom 
is the most appropriate unit because of the locus of treatment, the 
nested design and confounding of different levels of nesting, the levels 
of measurement, and the problems of missing data and program attrition. 
Where appropriate and practical, we discuss below the advantages and 
difficulties associated with alternative procedures for dealing with 
these problems. 

Locus of Treatment 

The locus of treatment in Follow Through is the classroom. Although 
the program is targeted for poor children, it is implemented and admin- 
istered on the classroom level (or some equivalent administrative group- 
ing). We recognize that teachers and aides respond differentially to 
pupil needs and characteristics, and thus, the "treatment" may be quite 
different from one pupil to another. But since we have no systematic or 
reliable way of defining or classifying such treatment variations, we
must assume that they are independently distributed. 



The Nested Design and Confounding at Different Levels 
of Nesting 



A very important consideration is the relative independence of 
differential treatment effects within classrooms compared to those across 
classrooms. We believe it is reasonable to assume that treatment effects
are correlated to a much greater degree within classrooms than across
classrooms; that is, what the teacher does with individual pupils or
groups of pupils in a given classroom will have some impact on all pupils 
in that classroom but will not necessarily be expected to have any impact 
on pupils in different classrooms. This means we cannot legitimately 
pool pupils across classrooms without first estimating (and partitioning) 
this classroom effect. This problem can be dealt with in at least three 
different ways: 

• By organizing the data into a hierarchical design with 
classrooms, and perhaps schools, as design variables. 

• By considering each classroom as a separate treatment 
and pupils as units in a one-way design. 

• By assuming within-classroom error variance to be normally
and independently distributed (NID), and aggregating
pupil data to the classroom level, so that each class-
room represents an independent estimate of program effect.



Each of the above three alternatives presents advantages and dis-
advantages. The first, hierarchical design, allows for separate esti-
mation and partitioning of effects at each level of nesting. Statisti-
cally, it can be considered the most appropriate. However, this model
requires balanced frequencies at each successive level, which our data
do not satisfy. Moreover, unweighted means solutions (Winer, 1962, 
Searle, 1971) are appropriate for dealing with only slight imbalances, 
whereas in our case the problem is acute. Thus, we concluded that this 
alternative (hierarchical design, unweighted means solution) offered no 
real advantages over simple aggregation to the classroom level (which 
is a special case of unweighted means).

The second alternative, considering the classroom as the experiment 
with pupils as the units, was rejected as inconsistent with the evalu- 
ation design and objectives. Follow Through is designed as both a school 
and community level program. The fact that the educational component is 
implemented at the classroom level is incidental to the program; it is
more a function of the administrative structure of elementary education. 
Moreover, the concept of planned variation requires that we assume (or 
test for) some regularity in treatment properties from classroom to 
classroom within projects. 

The third alternative, aggregating pupil level data to the classroom 
level under the assumption of uncorrelated and normally distributed "errors,"
has its own advantages and disadvantages. The most obvious disadvantages
are intuitive and interpretive. On the intuitive level, such a procedure
appears to shift evaluation focus from the individual child to the class- 
room, and to reduce statistical power and precision by decreasing obser- 
vations (there are certainly more children than classrooms). 

The first of these concerns can be met with the observation that 
Follow Through was not designed as a clinical program whose success would 
be based on case studies. Instead, FT is designed to assist poor or dis-
advantaged children to attain basic skills and attitudes necessary for sub-
sequent academic and social growth, and we are evaluating the program
in terms of these children (and their parents and teachers) as a group. 
Moreover, our measures of outcomes, such as standardized achievement 
tests, are primarily appropriate for group-level decisions. We admit 
that evaluation based on criteria of individual successes would be more 
desirable, and we anxiously await the development of such methodology 
for large-scale research.

The second disadvantage, reduction of statistical power and precision
due to reduced observations, is relevant to the extent that we have
relatively few classrooms with which to start. The reduction in obser- 
vations associated with the aggregation of pupil measures to classroom 
levels is somewhat offset by the corresponding reduction in variance 
presumed to exist at the classroom level, which, in the pupil level 
analysis, would appear as error variance. Also, classroom-level variables
enable more precise estimation of fallible data for use as covariables.
(Under the assumptions of classical reliability theory, the confidence
interval around a point estimate shrinks in proportion to the square root
of the class size. Hence, classroom-level data are more appropriate for
covariance adjustments.)
In other words, assuming substantial variance can be attributed to class- 
room differences (over and above program effects), the net effect of the 
two approaches (classroom analysis and pupil analysis) should yield equiv- 
alent statistical power and precision provided sufficient classrooms are 
available for estimation of variance components.

The same reasoning that leads to the aggregation of measures from
pupils to classrooms applies equally to all higher levels at which we 
suspect that effects other than those due to treatment may exist. For 
example, we could continue the aggregation of scores from pupils within 
classrooms on up to classrooms within schools, and finally, to schools 
within projects. However, each such aggregation carries with it all pre- 
ceding assumptions and the corresponding additional assumptions at the 
level involved. Moreover, the effects of such higher level aggregation 
on power and precision become so dramatic that a random effects model is 
required for aggregation beyond the classroom.* But since district and
project nearly perfectly covary, the only advantage that such higher ag- 
gregation would offer is that of removing school variance from treatment 
variance. And since we believe schools within projects or districts are 
reasonably well matched, we prefer to assume school variance is dominated 
by classroom/teacher variance. Hence, the classroom appears to be the 
most appropriate analysis unit. 
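
The aggregation argued for above can be illustrated with a minimal Python
sketch that collapses pupil-level scores to one mean per classroom. The
record layout and the names `classroom_means` and `pupil_rows` are
assumptions made for illustration only; they do not describe the actual
Follow Through data files.

```python
from collections import defaultdict
from statistics import mean


def classroom_means(pupil_rows):
    """Collapse (classroom_id, score) pairs to one mean per classroom.

    Scores recorded as None are treated as missing and skipped, so each
    classroom mean is estimated from whatever pupil data remain.
    """
    by_class = defaultdict(list)
    for classroom_id, score in pupil_rows:
        if score is not None:
            by_class[classroom_id].append(score)
    return {cid: mean(scores) for cid, scores in by_class.items()}


# Hypothetical pupil-level records: (classroom, achievement score).
pupil_rows = [
    ("A", 10), ("A", 14), ("A", None),
    ("B", 20), ("B", 22), ("B", 24),
]
print(classroom_means(pupil_rows))
```

Each classroom then contributes a single observation to the analysis,
which is the sense in which the classroom, rather than the pupil, serves
as the analysis unit.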



Variations in Levels of Measurement 

In the previous sections on instrumentation and variable definition, 
we presented categories of measures obtained at child, parent, teacher, 
and community levels. Aggregation to classroom-level representation
adjusts child and parent variables to the same level as teacher variables. 
This result is important for purposes of comparability of analysis and 
interpretation of results, and it enables integration of variables across 
categories for control purposes. We are still unable to integrate 
community-level measures with the pupil/parent/teacher measures, but in
project-by-project analyses, these variables remain fixed and, thus, are not of
concern. 



* A random effects model requires that we assume units were drawn at
  random from a larger population about which we wish to generalize
  and draw conclusions. A case can be made for employing random effects
  analyses with projects as the unit if we wish only to test national-
  level hypotheses concerning FT as a single treatment (i.e., FT versus
  NFT overall).

Problems of Program Attrition and Incomplete
and Missing Data 

One of the most difficult — often disastrous — problems occurring in 
large-scale research programs is that of data loss. Our experience in 
Follow Through, where a large number of measures are repeatedly gathered
for large groups of respondents, is no exception. The fact is that every 
data point lost because of incomplete measurement or subject mortality 
(attrition) actually alters the basic research design (i.e., changes the 
experiment in essentially unpredictable ways). If we were to restrict 
our interim analysis to those cases for which all data on all measures 
are present and presumed valid, we would most certainly be faced with 
a greatly reduced and most likely nonrepresentative subset of cases in
terms of our original samples. 

To the extent that missing or incomplete data are random events, both
within and across measures, our procedure of aggregation to classroom-level
data represents a partial solution to this problem. Likewise, if
program attrition is randomly distributed across subgroups (FT/NFT) within
projects, aggregation to classrooms will tend to reduce the apparent
impact of this attrition; i.e., even though some or most pupil data within
a classroom are lost to attrition, under the assumption that the attrition 
is random, an unbiased mean can still be estimated for the classroom from 
the remaining pupil data. 

Still another aspect of concern in aggregating data within classrooms 
is the general migration patterns of pupils from year to year across class- 
rooms and teachers. For the interim analysis, this becomes an issue only 
for Cohort I data, in which pupils are in their second experience year in 
Follow Through. Moreover, the problem is considered more crucial for the 
NFT classes, where redistribution of pupils across classes from year to 
year is likely to be more pronounced than in FT schools, where pupils tend 
to progress in relatively intact classroom groupings. 

Each of these data aspects received considerable attention in the 
preparation of data for analysis. The procedures used for handling missing 
data and for examining effects of classroom migration and attrition are 
presented below. However, since attrition and associated effects are 
currently being examined in depth in a separate study, only preliminary 
consideration will be given in this report.

Of the general methods for handling missing data problems (disregarding
it, statistically adjusting for it, and supplying it), we chose direct
imputation as the most expedient method for this interim evaluation.
A systematic step-by-step procedure was followed in constructing the data
matrices, beginning with the construction of pupil-, parent-, and
teacher-level variables. Specifically, pupil-level data were restricted to
those cases for which baseline data were present (roster and test battery)
and for which WRAT and Affect posttests were administered. Data for these
cases were combined into appropriate dependent and control variables
according to the definitions described earlier (see Definition and
Development of Variables). Variables were computed only for those cases
where 90 percent or more of the component data (e.g., item scores) were
present. All data were then aggregated on a variable-by-variable basis to
the classroom level (with the exception of the teacher variables, which
already existed at the classroom level), where classroom was defined in
terms of the Spring 1971 Rosters.*
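
The 90 percent completeness rule described above can be sketched in
Python as follows. This is only an illustration: the report does not say
how component items were combined, so the simple sum used here, along
with the names `composite_score` and `min_percent`, is an assumption.

```python
def composite_score(item_scores, min_percent=90):
    """Return a composite for one case, or None if too few items are present.

    item_scores: list of numbers, with None marking a missing item.
    min_percent: required percentage of non-missing items (90 in the text).
    The composite is a plain sum of the items present (an assumption).
    """
    present = [s for s in item_scores if s is not None]
    # Integer arithmetic avoids floating-point surprises at the threshold.
    if not item_scores or len(present) * 100 < min_percent * len(item_scores):
        return None
    return sum(present)


print(composite_score([1] * 9 + [None]))      # 9 of 10 items present
print(composite_score([1] * 8 + [None] * 2))  # 8 of 10 items present
```

A case with 9 of 10 items present yields a score, whereas a case with
8 of 10 is left missing and drops out of the classroom aggregate for that
variable.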

The next step was to examine the relative completeness of the 
classroom-level data. Two aspects of this examination were: 

• Assessment of the relative comparability of classroom-level
  scores obtained for the identic subset (those
  members of a given classroom for which both pretest
  and posttest data were "complete") and the total
  classroom average.

• Assessment of the proportions of missing data on a
  variable-by-variable basis across classrooms.

The results of the part-whole comparisons for classroom-level 
values of the pupil outcome variables are summarized in Table A-1. These 
results indicate the magnitude (in items) and significance level of the 
average differences between the classroom means of those pupils having 
both pretest and posttest data and the means of all pupils in the
classroom. These comparisons reveal that, for the baseline data (pretests),



* Note that in terms of baseline data, classroom grouping is wholly
  artificial, since these measures are gathered prior to, or close to, the
  commencement of formal instruction and treatment administration. However,
  classroom grouping for the second experience year of Cohort I
  (Spring 1971) does ignore the grouping for their first experience year
  (Spring 1970), which may involve carry-over consequences.

Table A-1



CLASSROOM MEANS OBTAINED FOR PRE-POST IDENTIC SUBGROUPS VERSUS
TOTAL CLASSROOM POPULATIONS ON BASELINE AND OUTCOME MEASURES

GROUP          VARIABLE       WRAT     QUANT      PROC      READ      LANG    AFFECT

COHORT I-K     PRE   ΔX̄      -.05*      .02      -.02       .08      -.01      -.03
(N = 337)            t       -.42       .30     -1.31       .63      -.36     -1.09
               POST  ΔX̄     -1.07      -.66      -.06     -1.09      -.57      -.08
                     t      -4.95†    -4.45†    -2.51*    -5.30†    -4.62†    -2.05*

COHORT I-EF    PRE   ΔX̄       .05       .03       NA        .01      -.02       .02
(N = 118)            t        .86       .99       NA        .24     -1.02      1.57
               POST  ΔX̄      -.30       .21       NA       -.61      -.16      -.07
                     t      -1.26       .95       NA      -1.26      -.97     -1.38

COHORT II-K    PRE   ΔX̄       .26       .05      -.03       .01      -.30       .11
(N = 93)             t       1.29       .38      -.39       .02     -1.83       .29
               POST  ΔX̄      -.04      -.20      -.13      -.55      -.31      -.58
                     t      -1.25     -4.47†    -4.80†    -3.80†    -4.08†    -3.94†

COHORT II-EF   PRE   ΔX̄       .14       .27       .13       .91       .55       .66
(N = 34)             t        .84      1.57       .76       .59       .71       .61
               POST  ΔX̄       .01      -.42      -.05      -.41      -.28      -.49
                     t        .22     -4.96†    -1.58     -3.21†    -2.66*    -3.78*

NA = not applicable.

* Entries represent the mean difference of total minus identic scores.
  Thus, a minus sign indicates the identic means are greater than the
  total class means.

† p < .01.
  p < .05.



means computed from identic and total classroom samples are not
significantly different. This is important because it establishes to some
extent the lack of a systematic bias in the nonattrited pupil sample
in terms of baseline or incoming characteristics.

The posttest comparisons of identic and total class means indicate
that identic pupils score higher than nonidentic pupils. One
interpretation of this phenomenon is that the slowest pupils are most likely
to be subject to attrition and/or pupils transferring into the program
are academically behind the continuing (identic) pupils. However, since
these posttest patterns were equivalent in FT and NFT classes, and since
they were systematic across projects, a more likely interpretation is
that identic pupils display a test-retest effect of approximately one
item, on the average, over those pupils who migrate into the evaluation
sample after pretesting. The results, we feel, provide at least indirect
evidence that use of identic pupil scores to estimate the classroom means
does not produce subgroup bias (with the exception of the overall retest
increment, which appears constant across all subgroups). This is
particularly important, since control variable data exist only for the
identic pupils.
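
One plausible reading of the part-whole comparison reported in Table A-1
is a one-sample t test on the per-classroom differences (total mean minus
identic mean). The report does not state the exact test used, so the
Python sketch below, including the name `part_whole_t` and the sample
values, is an assumption offered for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def part_whole_t(total_means, identic_means):
    """Mean difference (total minus identic) and its one-sample t statistic.

    Each list holds one value per classroom, in the same classroom order.
    """
    diffs = [t - i for t, i in zip(total_means, identic_means)]
    d_bar = mean(diffs)
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return d_bar, d_bar / standard_error


# Hypothetical classroom means for four classrooms.
total = [10.0, 12.0, 11.0, 13.0]
identic = [11.0, 12.5, 12.5, 14.0]
d_bar, t = part_whole_t(total, identic)
print(d_bar, t)
```

With these made-up values the mean difference is negative, which, under
the sign convention of Table A-1, corresponds to identic pupils scoring
higher than the full classroom.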

The second issue, the magnitude of missing data, was evaluated by
tabulating the aggregated classroom values on a variable-by-variable
basis. This amounted to distributing classroom scores on pupil, parent,
and teacher variables, both outcome and control. Note that aggregation
of parent variables was determined by and tied to the corresponding
aggregation of pupil variables. These relative proportions of missing
classroom-level data are summarized for control and outcome variables in
Table A-2.



Table A-2

PERCENT OF MISSING DATA FOR EACH CLASS OF VARIABLES
FOR EACH OF THE ANALYSIS GROUPINGS

                     PUPIL               PARENT              TEACHER
VARIABLE        OUTCOME  CONTROL    OUTCOME  CONTROL    OUTCOME  CONTROL

COHORT I-K        1.57     1.60       1.50     4.80      22.51    22.20
                    (N = 356)           (N = 329)           (N = 253)

COHORT I-E        2.40      .74        .91     7.40       2.74     1.58
                    (N = 128)           (N = 109)           (N = 74)

COHORT II-K       7.79     7.79       0        5.56       0        1.34
                    (N = 77)            (N = 72)            (N = 32)

COHORT II-E      16.13    11.78       0        4.17       0        1.06
                    (N = 31)            (N = 24)            (N = 27)

As can be seen from Table A-2, the relative extent of missing
or incomplete data was not severe, except for Cohort I-K teacher variables.
Since we felt this overall result was acceptable for the purposes of our
interim analysis, we proceeded to impute missing classroom-level values
according to the following algorithm:

• Data missing at the classroom level were imputed from the
  mean of the school at the appropriate subgroup classification
  (i.e., grade level by cohort by treatment condition).

• If imputation within the school was unsuccessful, the next
  level of nesting, the district, was used to impute the
  missing values.

• If imputation failed at the district level, then the
  overall subgroup mean was used to supply missing data values.
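
The three-step fallback above can be sketched in Python as follows. The
function name `impute`, the dictionary keys, and the sample records are
hypothetical; the sketch only illustrates the school, then district, then
overall-subgroup order described in the text.

```python
from statistics import mean


def impute(classrooms, variable):
    """Fill missing classroom values: school mean, then district, then overall.

    classrooms: list of dicts with keys 'school', 'district', 'subgroup'
    and the variable of interest (None when missing). Means are taken only
    over observed values from classrooms in the same subgroup.
    """
    def level_mean(target, key):
        values = [c[variable] for c in classrooms
                  if c[variable] is not None
                  and c["subgroup"] == target["subgroup"]
                  and (key is None or c[key] == target[key])]
        return mean(values) if values else None

    filled = []
    for c in classrooms:
        value = c[variable]
        if value is None:
            # key None means "whole subgroup", the last resort.
            for key in ("school", "district", None):
                value = level_mean(c, key)
                if value is not None:
                    break
        filled.append({**c, variable: value})
    return filled


# Hypothetical classroom records within one subgroup.
rows = [
    {"school": "S1", "district": "D1", "subgroup": "CI-K-FT", "score": 10.0},
    {"school": "S1", "district": "D1", "subgroup": "CI-K-FT", "score": None},
    {"school": "S2", "district": "D1", "subgroup": "CI-K-FT", "score": 14.0},
    {"school": "S5", "district": "D1", "subgroup": "CI-K-FT", "score": None},
    {"school": "S3", "district": "D2", "subgroup": "CI-K-FT", "score": None},
    {"school": "S4", "district": "D3", "subgroup": "CI-K-FT", "score": 18.0},
]
print([r["score"] for r in impute(rows, "score")])
```

Means are always computed from observed values only, so an imputed value
never feeds into a later imputation.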

It should be noted that only rarely was it necessary to proceed 
beyond the school level, and almost never beyond the district level, in 
the imputation procedure. As such, we feel the likely bias (most probably 
reduced variance and increase in Type I error rates) introduced by this 
procedure is small relative to its advantages of maintaining sufficient 
cases so that analyses could be performed. 

Part 2: Issues in the Selection of ANCOVA as the Method
of Analysis 

The selection of one-way analysis of covariance as the principal
statistical method for evaluating interim FT impacts was based on
considerations of logical appropriateness, robustness and power, and
interpretability. Each of these considerations is discussed in detail in the
following sections. Arguments for the parallel analysis of project and
sponsor treatment groupings are also discussed. Finally, considerations
of alternative methods of analyses and of alternative techniques of
covariable adjustments are presented. Where possible, relevant data are
provided to support our arguments.

Logical Appropriateness

In terms of logical appropriateness, the principal issues were the
structure of the treatment variables, the use of control groups, and the
definition of replication samples. To consider variations of FT treatment
on any other than a one-way continuum would require not only the
specification of the additional continuums but also rather precise
quantification of treatment variables on each such continuum. The objective
of such quantification would presumably be to establish a basis for
organizing treatments into a factorial or partially crossed structure for
analysis of main effects and interactions among "planned variations."
However, we now have only qualitative distinctions, corresponding to sponsor
descriptions of their respective models, and no assurance that variability
in treatment is any less across projects within a given sponsor than
across projects between sponsors. Thus, although it is more convenient
for purposes of interpretation to group projects within sponsors and,
perhaps, to cluster sponsors in terms of meta-models or theoretical
constructs (e.g., psychological growth versus parent involvement approaches),
we feel it is more defensible on both rational and statistical bases to
consider each project as a relatively independent treatment in a large
pseudo- (or quasi-) experiment, and to organize the collection of these
treatments into a one-way analysis of variance design.

The purpose of within-project control groups was to provide a basis
for separating district variance (variability in scores observed from 
one district to another) from program variance (variability in scores 
attributable to FT programs). The issue of how to handle these control 
groups within the one-way analysis design was resolved by treating them 
as nested within each project. We rejected the alternative of represent- 
ing the FT/NFT distinction as crossed with treatments, since such a treat- 
ment by levels (FT/NFT) analysis design requires that we be able to esti- 
mate treatment effects at the NFT level, which of course cannot be done. 
Also, when crossed analysis of variance designs are unbalanced with respect
to cell frequencies (as is the case with FT data), resultant hypotheses 
tests are biased in potentially complex ways, making interpretation very 
difficult. We reasoned that through use of planned comparisons (i.e., lin- 
ear contrasts of elements in the one-way design), tests of all major hy- 
potheses of interest could be accomplished while problems associated with 
the crossed design were avoided.
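
A planned comparison of the kind described above is a linear contrast
among the cell means of the one-way design, with weights that sum to
zero. The Python sketch below uses the standard contrast t statistic; the
cell layout, the function name `contrast_t`, and all numbers are
hypothetical and are not taken from the Follow Through analyses.

```python
from math import sqrt


def contrast_t(cell_means, cell_ns, weights, ms_within):
    """Value and t statistic of a linear contrast among one-way cell means.

    weights must sum to zero; ms_within is the pooled within-cell mean
    square from the one-way analysis.
    """
    if abs(sum(weights)) > 1e-12:
        raise ValueError("contrast weights must sum to zero")
    psi = sum(w * m for w, m in zip(weights, cell_means))
    standard_error = sqrt(
        ms_within * sum(w * w / n for w, n in zip(weights, cell_ns))
    )
    return psi, psi / standard_error


# Hypothetical cells: project 1 FT, project 1 NFT, project 2 FT, project 2 NFT.
means = [52.0, 48.0, 50.0, 51.0]
ns = [8, 4, 8, 4]
ft_vs_nft_project_1 = [1, -1, 0, 0]
print(contrast_t(means, ns, ft_vs_nft_project_1, ms_within=6.0))
```

Because each contrast names only the cells it compares, unequal cell
frequencies enter through the standard error alone, and no crossed
treatment-by-levels structure has to be imposed on the design.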

A second issue concerning the use of control groups in the analysis
of covariance design pertains to the equivalence or comparability of
subgroups (FT/NFT) within projects on relevant covariables (baseline and
background variables). Substantial noncomparability can present serious
and perhaps unresolvable analysis problems. Inspection of those
covariables in preliminary descriptive analyses did reveal moderate and
occasionally substantial lack of FT/NFT covariable comparability within
projects. (These covariable values are summarized in project data tables
in the section on results.) These biases are viewed as a direct consequence
of the lack of randomization in sampling and assignment to experimental
groups in the quasi-experimental evaluation design. A variety of methods
for dealing with this control group problem has recently been suggested,
including the analysis of pre-post difference scores, regressed gain scores,
repeated measures, indirect standardization, treatment-effect correlation,
and analysis of covariance (Campbell, 1971; Campbell and Erlebacher, 1970;
Harris, 1967; Hsia, 1971; O'Connor, 1972; Porter, 1972; Wiley (in press);
Whitla, 1968). We rejected the gain score methods because of
noncomparability of pre- to post-measures (tests change from kindergarten to
second grade) and treatment-effect correlation methods because of
insufficient project data points. And since indirect standardization
[comparing groups on the basis of residual outcomes derived from multiple
regression of the entire sample's outcome scores with relevant control
variables (see Shaycoft et al., 197'^)] is a special case of classical
analysis of covariance, the latter appeared more direct and appropriate
for our analysis purposes.

To proceed with analysis of covariance, the following assumptions 
are required: 

1. The samples were drawn from a common population.

2. The subgroups (FT/NFT) are experimentally independent.

3. Experimental "errors" are independently (and less
   importantly, normally and homogeneously) distributed.

4. Covariables are uncorrelated with treatment, measured
   without error, and homogeneously distributed.

The descriptive analyses provided some support for Assumption 1;
specifically, FT and NFT samples were predominantly composed of
below-average families in terms of generally accepted SES indicators (e.g.,
income, education, occupation), and thus, all appeared to be drawn from the
disadvantaged population. Also, since FT and NFT pupils are generally in
separate schools, Assumption 2 seems plausible.

Assumption 3 is fundamental to the fixed effects ANCOVA model.
Invalidation of this assumption would seriously affect the validity of the
analysis, particularly the hypothesis tests. However, Glass et al. (1972)
have shown that nonindependence of errors produces far more serious
consequences than does nonnormality of errors.

In the Follow Through experiment, pupils are generally grouped into
self-contained classrooms. Since each such classroom represents a
homogeneous unit (common teacher, location, facilities, and interactions among
components), possible nonrandomness of error among pupils within classrooms
can be argued. However, classrooms can be viewed as experimentally
independent, and we believe they constitute an appropriate unit for analysis
in terms of Assumption 3. Additional considerations for the use of
classroom-level data in our analysis of effects are discussed in the
preceding section.

Relevant to Assumption 4 is the observation that often the FT sample
within a project appears somewhat "more disadvantaged" than its
corresponding NFT sample. This may indicate the existence of a biased
selection or assignment process; e.g., the poorest families are recruited for
FT programs, or only the poorest schools (ad hoc) are awarded FT grants.
However, this bias becomes troublesome only if we postulate anticipated
interactions among these and other control variables that will affect the
program, thus violating Assumption 4. For example, children from extremely
impoverished families may have suffered organic damage from extended
malnutrition and consequently would not be expected to show any improvement,
whereas children from moderately impoverished families might be very
responsive to the same program. If the poorest families were
disproportionately allocated to FT, the above described interaction would
strongly affect results of FT/NFT comparisons. More subtle examples of these
disruptive interactions occur with respect to measurement instruments in
the form of floor and ceiling effects.

Two general methods of dealing with this post hoc nonrandom covariable
interaction problem are polynomial regression and blocking. In polynomial
regression, the exact nature of the suspected interaction is modeled
and statistically controlled by means of polynomials of appropriate degrees.
This requires fairly elaborate and precise theories and a total absence of
error in the covariates.* Although this approach is currently popular in
econometrics, it far exceeds the current precision and robustness of
theories and measures in educational evaluation. The second alternative,
blocking, is generally more appropriate, we believe, to our current state
of evaluation technology. But this method imposes the same requirements
as the factorial design in terms of balanced cell frequencies. Furthermore,
it assumes blocks were defined and units were randomly assigned
across conditions within blocks prior to the administration of the
treatment. Since the FT data satisfy none of these conditions, we rejected
this method. Due to regional and sampling variability, there are virtually
no control variables for which blocking strata of critical or theoretical
interest could be established to achieve a balanced design. Even
if such strata could be established, they would be post hoc, which would
seriously affect probability statements on any hypothesis tests.
Consequently, we implemented standard classical analysis of covariance to
reduce experimental error.

* The absence of error assumption is required for literally any
  adjustment on control variables.
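
To make the classical covariance adjustment concrete, the Python sketch
below adjusts group means on a single covariable using the pooled
within-group regression slope. It is a simplified illustration only: the
analyses described in this report involve several covariables and many
cells, and the function name `adjusted_means` and the data are
hypothetical.

```python
from statistics import mean


def adjusted_means(groups):
    """Classical one-covariable ANCOVA adjustment of group means.

    groups: dict mapping a group label to a list of (x, y) pairs, where
    x is the covariable and y the outcome for one analysis unit.
    Adjusted mean = y_bar - b_within * (x_bar - grand_x_bar).
    """
    sxy = sxx = 0.0
    stats = {}
    for label, pairs in groups.items():
        x_bar = mean(x for x, _ in pairs)
        y_bar = mean(y for _, y in pairs)
        stats[label] = (x_bar, y_bar)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sxx += sum((x - x_bar) ** 2 for x, _ in pairs)
    b_within = sxy / sxx  # pooled within-group slope
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)
    return {label: y_bar - b_within * (x_bar - grand_x)
            for label, (x_bar, y_bar) in stats.items()}


# Hypothetical classroom-level (covariable, outcome) pairs.
data = {
    "FT": [(1, 3), (2, 5), (3, 7)],
    "NFT": [(3, 6), (4, 8), (5, 10)],
}
print(adjusted_means(data))
```

In this made-up example the FT group starts lower on the covariable, so
its unadjusted outcome mean is lower than the NFT mean, while its
adjusted mean is higher. This is the kind of correction the covariance
analysis is meant to supply when FT and NFT groups are not comparable at
baseline.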

To safeguard our analyses from obvious violation of Assumption 4,
we performed a series of preliminary screening analyses on potential
covariables by examining overall covariable-dependent variable
correlation matrices and by examining individual bivariate scatter plots of
covariables with treatment and dependent variables. The purpose of this
procedure was to ensure that covariables were both uncorrelated with
treatment (FT/NFT) and were reasonably and linearly correlated with
outcomes. Although this procedure does not ensure within-subgroup
homogeneity of covariables (as we shall demonstrate later), the within-cell
frequencies were too small to enable a test of this assumption. But
Glass et al. (1972) reported, "...the empirical sampling distribution of
the F-statistics differed little from the theoretical sampling
distribution unless the departure from homogeneous slopes was extreme" (p. 277).
Thus, the procedure does provide at least limited protection against these
violations of the model (covariable interactions and correlations with
treatment variables).
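
The screening step described above can be sketched in Python as a pair of
correlations per candidate covariable: one with the treatment indicator
and one with the outcome. The report gives no numerical cutoffs, so the
thresholds `max_r_treatment` and `min_r_outcome`, the function names, and
the data below are all assumptions made for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def screen_covariable(covariable, treatment, outcome,
                      max_r_treatment=0.2, min_r_outcome=0.3):
    """Return (r with treatment, r with outcome, keep flag).

    treatment is coded 1 for FT and 0 for NFT. The two cutoffs are
    hypothetical placeholders, not values from the report.
    """
    r_t = pearson_r(covariable, treatment)
    r_y = pearson_r(covariable, outcome)
    keep = abs(r_t) <= max_r_treatment and abs(r_y) >= min_r_outcome
    return r_t, r_y, keep


# Hypothetical classroom-level values.
covariable = [1, 2, 3, 4, 1, 2, 3, 4]
treatment = [1, 1, 1, 1, 0, 0, 0, 0]
outcome = [2, 4, 6, 8, 3, 5, 7, 9]
print(screen_covariable(covariable, treatment, outcome))
```

A correlation screen of this kind checks only linear association; the
scatter plots mentioned in the text remain necessary for spotting
curvature or floor and ceiling effects.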

The remaining logical issue in the choice of analysis procedures was 
the definition of replication samples. We rejected a single analysis 
pooled over all data because of known and assumed systematic inequalities 
on both dependent and independent variables among the subsamples. For 
example, the EF samples were systematically one year older than the
corresponding K samples within cohorts. This meant the subgroups were
noncomparable in terms of both underlying developmental/maturational
variables and test battery content. Similarly, we rejected pooling across
cohorts within age levels, since both treatment and test variables under- 
went considerable transformation from one year to the next. Consequently, 
we reasoned that the most appropriate analysis would be one that is per- 
formed separately on each of four independent replication samples: CI-K, 
CI-EF, CII-K, CII-EF. 

Within these replication samples, several additional analysis
alternatives were considered. One was that of conducting separate ANCOVAS on
a project-by-project basis. Another was conducting repeated-measures ANCOVAS
on Cohort I first and second year data. Both of these alternatives were
rejected for reasons described below. Project-by-project analyses were
considered because sponsors are not well distributed across school
districts or, for that matter, across geographic regions. Indeed, nearly
a one-to-one correspondence exists between project and district, and,
with few exceptions, district variations are almost totally nested within
sponsors. Since there was sample evidence of a strong district variance
component, the danger of serious confounding of district and sponsor
sources of variance exists. Although within-district control groups
(the FT/NFT dimension) provide a basis of separating these sources of
variance, performing separate analyses on a project-by-project basis
within each replication sample would be even better. The problem with
this method is that when classroom-level variables are used as analysis
units, we simply do not have enough observations to perform control
variable adjustments and to estimate effects for separate project-by-project
analyses. Just as there are insufficient degrees of freedom to test the
covariance matrices at the project level for assumptions of the ANCOVA
model, there also are too few degrees of freedom to perform separate
ANCOVAS (or even ANOVAS) on a project-by-project basis. On the other
hand, combining projects within cohorts and grade streams into separate
analyses does develop sufficient observations or degrees of freedom for
hypothesis tests. Use of project descriptor variables as covariates for
the interproject variance component enables unbiased estimation of FT/NFT
effects, which, through use of planned comparisons, can be tested on a
project-by-project basis. In summary, if one is willing to assume that
the project descriptors appropriately index the interproject variance
component, then the larger (cohort-level) analyses on a set of independent
projects should yield comparable results to those of separate (and
unfeasible) project-by-project analyses.

The consideration of repeated-measures analyses across the one and
two year outcomes of Cohort I projects was based on concern for even
further reduction in error variance through use of correlated properties
in these data. However, we rejected this analysis method for the
following reasons:

• The one year sample was only a subset of the two year sample.

• Different measures were obtained across the years.

• Classroom compositions were noncomparable from year to year.

As such, separate analyses were performed on the Cohort I one year
(Spring 1970) and two year (Spring 1971) data, as well as the Cohort II
one year (Spring 1971) data. Since data for separate grade streams (K and
EF) were analyzed separately, a total of six separate data matrices were
analyzed.



Robustness and Power 

Two factors of considerable concern in selecting our analysis method
were robustness and power. Robustness is the extent to which the results
of our hypothesis tests are affected by departures in our data from
assumptions in the statistical model employed. Power is the extent to which
our procedures lead to valid rejection of null hypotheses.



Robustness 

The three basic assumptions underlying the fixed effects model
one-way analysis of variance are:

• Independence of within-cell error components

• Normality of within-cell error components

• Homogeneity, between cells, of within-cell error
  variances.

For the analysis of covariance (the model used in the present
study), the above three assumptions hold plus a fourth:

• Homogeneity, between cells, of within-cell dependent-
  variable/covariable(s) regression.

Certainly, Assumption 1 is critical for a valid analysis, and 
the rationale (presented earlier) underlying the choice of the classroom 
as the unit of statistical analysis is relevant here. Since the individual
teacher is likely to be the most potent factor (aside from experimental
treatment variation) in studies of the present type, considerable lack
of independence could be expected by treating the pupil as the unit of
analysis. One could argue for aggregating to an even larger unit, e.g., 
the school or even the district. However, as has been noted earlier, 
serious problems regarding degrees of freedom could be expected and it 
is debatable whether the decrease in potential nonindependence would make 
the effort worthwhile. 

The effects of violations of Assumption 2 on Type I error rates
and power when the n's are equal or unequal are well known (see Glass et
al., 1972). Assumptions 3 and 4 have been shown to be relatively
unimportant at the practical level (i.e., they can be violated with little
effect) as long as the cell n's are equal. Otherwise, the combined effects of
heterogeneous n's and σ's are unpredictable and can be substantial. Again,
using the analyses of FT versus NFT based on orthogonal contrasts as an
example for the present study, the n's were unequal.


Issues of Statistical Power



Power, in its most general context, refers to the overall prob- 
ability of rejecting false null hypotheses. In this context, the well- 
known power determinants are overall sample size and number of sources of 
systematic variation in the experiment. For the latter, techniques such
as "blocking" on concomitant variables or covarying these covariables
constitute the best known means of increasing power. 

At a more specific level, the issue is the exact power (in terms 
of a probability statement) for a specified alternative hypothesis. For 
such a computation, two parameters must be estimated either by a priori 
or by empirical means. The first of these is the magnitude of treatment 
effect, $\tau_j$, for which the researcher feels he must reject $H_0$. Put 
another way, the investigator must fix the magnitude of the treatment 
effects so that if, in fact, effects of this magnitude exist in the 
population, the experiment has a high probability of detecting this state 
of affairs. The other parameter for which some estimate must be made 
is the unsystematic or unspecified (error) variance of the dependent 
variable in the population, $\sigma_e^2$. Consider the case where the 
treatment effects that were deemed by the investigator to be large enough 
to require detection are, in fact, present in the population. If the 
experiment involved n subjects per cell (assuming a one-way layout with J 
treatment groups), then the test statistic 
$F = MS_{\text{between}}/MS_{\text{within}}$ is distributed not as a 
central F variate with (J - 1) and J(n - 1) df, but rather as a displaced 
or noncentral F, with df = (J - 1) and J(n - 1) and the mean displaced 
approximately by the value $\lambda/(J - 1)$, where $\lambda$, the 
noncentrality parameter, is given by 

\[ \lambda = \frac{n \sum_{j=1}^{J} \tau_j^2}{\sigma_e^2} \]

At the procedural level, the value $\phi$ is obtained by 

\[ \phi = \sqrt{\frac{\lambda}{J}} = \sqrt{\frac{n \sum_{j=1}^{J} \tau_j^2}{J \sigma_e^2}} \]

Although it is generally understood that increasing N, which is equal to 
$\sum_{j=1}^{J} n_j$, can be expected to increase the overall power of an 
experiment, this fact is, loosely speaking, "truest" if J, the number of 
treatment groups, remains constant. For example, if we had an experiment 
with two groups, 10 subjects per group, $\sigma_e^2 = 100$, and 
$\tau_j = 4$, $\phi$ in the above formula would be 1.26. If 
$\alpha = .05$, the power of this experiment would be approximately .40. 
On the other hand, if we redistributed the 20 subjects into four groups, 
leaving $\tau_j$, $\sigma_e^2$, and $\alpha$ the same, $\phi$ would be 
.89, and the power would be about .25. To leave 10 subjects per cell and 
have four treatment groups would require doubling the sample size, and if 
we left the values $\tau_j$, $\sigma_e^2$, and $\alpha$ as before, we 
would have $\phi = 1.26$ again, but this time the power would be 
approximately equal to .49. 
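
The arithmetic of this example can be checked with a short sketch. It is 
not part of the original analysis: it assumes the noncentrality parameter 
is n times the sum of squared effects divided by the error variance, that 
phi is the square root of that quantity over J, and that "effects of 4" 
means effects of magnitude 4 with alternating signs so that they sum to 
zero. The function names are hypothetical, and the noncentral F 
probabilities are computed from scratch with the standard library rather 
than read from power charts, so small differences from the quoted values 
are to be expected. 

```python
import math


def phi(n_per_cell, taus, sigma2_e):
    """phi = sqrt(n * sum(tau_j^2) / (J * sigma_e^2)) for J equal-sized cells."""
    J = len(taus)
    return math.sqrt(n_per_cell * sum(t * t for t in taus) / (J * sigma2_e))


def _betacf(a, b, x, max_iter=500, eps=1e-12, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        even = m * (b - m) * x / ((qam + m2) * (a + m2))
        odd = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        for aa in (even, odd):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def f_cdf(f, df1, df2, ncp=0.0):
    """CDF of a (non)central F variate as a Poisson mixture of beta CDFs."""
    if f <= 0.0:
        return 0.0
    x = df1 * f / (df1 * f + df2)
    half = ncp / 2.0
    weight = math.exp(-half)  # Poisson probability of k = 0
    total = 0.0
    for k in range(1000):
        total += weight * betainc(df1 / 2.0 + k, df2 / 2.0, x)
        weight *= half / (k + 1)
        if weight < 1e-15 and k > half:
            break
    return total


def f_critical(alpha, df1, df2):
    """Upper-alpha critical value of the central F, found by bisection."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if f_cdf(mid, df1, df2) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def power(n_per_cell, taus, sigma2_e, alpha=0.05):
    """Power of the one-way F test with J = len(taus) equal-sized cells."""
    J = len(taus)
    df1, df2 = J - 1, J * (n_per_cell - 1)
    lam = n_per_cell * sum(t * t for t in taus) / sigma2_e
    return 1.0 - f_cdf(f_critical(alpha, df1, df2), df1, df2, ncp=lam)


# The three designs discussed in the text (effect signs are an assumption).
designs = {
    "2 groups x 10": (10, [4, -4]),
    "4 groups x 5": (5, [4, -4, 4, -4]),
    "4 groups x 10": (10, [4, -4, 4, -4]),
}
for name, (n, taus) in designs.items():
    print(name, round(phi(n, taus, 100.0), 2), round(power(n, taus, 100.0), 2))
```

Under these assumptions the phi values reproduce the 1.26, .89, and 1.26 
quoted above, and the computed power values should fall close to the 
chart-based figures of .40, .25, and .49. 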

The preceding example reinforces the idea that power is more 
than simply a function of N. Needless to say, the investigator is wise 
to divide his total N into as few groups as empirically and intuitively 
possible. 

In the Follow Through study, analyses were often performed on 
large numbers of classrooms (the sampling unit) and large numbers of 
projects (an independent variable). Also, to some extent, the locus of 
greatest interest involves the individual FT versus NFT comparisons within 
projects. These comparisons are often based on means having a small number 
of observations, despite the overall large N in some cases. 

For this reason, we consider it important to assess power for 
some of the dependent variables. Two complicating factors arose in 
this power analysis. First, a large number of FT versus NFT comparisons 
were made on the same dependent variable, whereas what was really desired 
was an overall assessment of power. Secondly, each comparison was based 
on different cell n's and also different adjusted (by the covariables) 
estimates of $\sigma_e^2$. Thus, to be able to arrive at an overall 
(albeit not completely precise) estimate of the power of, for example, the 
planned contrasts using the Cohort I-K sample and the achievement-dependent 
variable, pooling of n's (see Cohen, 1969) and $\sigma_e^2$'s and averaging 
were employed. More specifically, if we have 2J cells of $n_j$ observations 
each (J projects for both FT and NFT), and adjusted variance errors of 
comparisons $\sigma_d^2$, we may define 

\[ \bar{n} = \frac{1}{2J} \sum_{j=1}^{2J} n_j \]

We can then estimate a common adjusted $\bar{\sigma}_e^2$ term as 

\[ \bar{\sigma}_e^2 = \bar{\sigma}_d^2 \left( \frac{\bar{n}}{2} \right) \]

We then employ the statistic $\phi$, given in this case by 




The question of "average power" for the paired (FT versus NFT) 
comparisons, using one sample and one dependent variable at a time, was 
addressed by entering the formula with various values of $\tau_j$. For each 
of the samples and dependent variables examined, one computation was 
made in terms of the average (over the J projects) absolute value of 
treatment effects obtained between FT and NFT groups. Other computations 
were made in terms of treatment effects corresponding to .5, .75, 
and 1.0 adjusted standard deviations of overall dependent-variable scores. 
The results are displayed in Table A-3. 
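
The pooling step can be sketched as follows. This is only an 
illustration: the text says that pooling and averaging were employed 
(following Cohen, 1969) without giving details, so the simple unweighted 
averages below are an assumption, the function names are hypothetical, and 
the example inputs other than N = 330 and 2J = 56 are invented. 

```python
def pooled_cell_size(cell_ns):
    """Average cell size over the 2J project-by-treatment cells."""
    return sum(cell_ns) / len(cell_ns)


def common_error_variance(sigma2_d_values, n_bar):
    """Common error variance from averaged variance errors of comparisons.

    The variance error of a difference between two means of n observations
    each is 2 * sigma_e^2 / n, so sigma_e^2 = sigma_d^2 * n / 2.
    """
    sigma2_d_bar = sum(sigma2_d_values) / len(sigma2_d_values)
    return sigma2_d_bar * n_bar / 2.0


# Cohort I-K: 330 classrooms in 2J = 56 cells gives the average cell size.
n_bar_cohort_1k = 330 / 56
print(round(n_bar_cohort_1k, 2))

# Invented cell sizes and variance errors, for illustration only.
n_bar = pooled_cell_size([6, 5, 7, 6])
print(common_error_variance([2.0, 3.0], n_bar))
```

The small average cell size for Cohort I-K (fewer than six classrooms per 
cell) is what drives the modest power values reported for that sample. 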



It is often difficult for the investigator to specify the $\tau_j$ 
values in making power computations. One criterion often used is one 
standard deviation (SD) in the distribution of dependent variables. 
Note that a $\tau_j$ of 1 SD for two groups (as is the case in the FT 
versus NFT comparisons) implies a difference between treatment means of 
2 SDs. This difference might be considered too large to constitute a 
minimum difference to be detected. Thus, the values in Table A-3 for 
$\tau_j$ values of .5 SD (and, therefore, a difference between means of 
1 SD) may be the best ones upon which to focus. For these values, the 
average comparison between FT and NFT means appears to have somewhat lower 
power than would be considered optimal on both the achievement and 
affective variables. This fact is particularly true in the case of the 
Cohort II data, both K and EF. One must remember that the power values are 
averages, however, and that they are somewhat crude averages at that. The 
power for a specific within-project comparison between FT and NFT on these 
dependent variables may be greater than the tabled values, although some 
comparisons will also have lower power. 

TABLE A-3 

AVERAGE POWER VALUES FOR PLANNED COMPARISONS WITH 
VARIOUS DEPENDENT VARIABLES 

                                     AVERAGE
                                     TREATMENT
DEPENDENT VARIABLE                   EFFECT      .5 SD    .75 SD    1 SD

ACHIEVEMENT
  COHORT I-K   (N = 330, J = 28)      .57         .30      .57      .82
  COHORT I-EF  (N = 123, J = 11)      .20         .24      .40      .69
    IF N = 176 (J = 11)               .23         .31      .59      .83
    IF N = 220 (J = 11)               .27         .38      .69      .90
  COHORT II-K  (N = 51, J = 8)        .23         .10      .17      .28
    IF N = 160 (J = 8)                .58         .25      .48      .72
  COHORT II-EF (N = 31, J = 4)        .28        <.10      .10      .15
    IF N = 120 (J = 4)                .86         .17      .29      .48

AFFECT
  COHORT I-EF  (N = 123, J = 11)      .20         .24      .40      .69
    IF N = 220 (J = 11)               .27         .37      .69      .90
  COHORT II-EF (N = 31, J = 4)        .34        <.10      .10      .15
    IF N = 120 (J = 4)                .93         .17      .29      .47

In general, the effects of increasing overall sample size 
while leaving the number of treatment groups unchanged resulted in power 
increases that could perhaps be considered not worth the effort. For 
example, almost doubling the Cohort I-EF sample to 220 classrooms would 
increase the average probability of detecting a (real) difference 
between means of .5 SD from .24 to only .38. 

To discover the effects of redistributing the sample classrooms 
into a smaller number of treatment groups, we considered a hypothetical 
redistribution of the 330 Cohort I-K sample classrooms, which were 
actually spread over 56 project-by-treatment (FT versus NFT) cells. Had 
there been only 20 such cells (10 projects, as opposed to 28) to which the 
330 classrooms were assigned, the power for detecting a .5-SD effect, on 
the average, would have increased from .30 to .67. For detecting a .75-SD 
effect, the power would have increased from .57 to .95, and for 1.0 SD, 
from .82 to, for all intents and purposes, 1.0. 

Although only a small number of pupil-dependent variables were 
employed in the assessment of power, the findings can be expected 
to generalize to the other variables analyzed. Table A-3 shows 
that the power values for a given sample on the achievement-dependent 
variable were almost identical to those on the affective-dependent 
variable in the cases in which both were examined. This correspondence can 
be expected to hold throughout the other analyses. 

The implications of the preceding discussion are twofold. First, 
the fact that more significant differences were not found in the FT versus 
NFT comparisons must be tempered somewhat by the fact that real differences 
of a magnitude most would consider worth reporting may have, in some cases, 
gone undetected because of the relatively low power in many of the analyses. 
Secondly, it would appear that a case could be made, at least from the 
evaluation point of view, for implementing the "planned variation" concept 
with fewer "variations" and a more substantial data base for each. This 
latter notion would appear to be reasonably consonant with the view of 
Follow Through as an experiment. 

Interpretability 



The third factor of concern in our selection of the analysis method 
was the interpretability of results. In this context, our one-way 
analysis of covariance is considered appropriate for a number of reasons. 
First, for each replication sample (CI-K, CI-EF, CII-K, CII-EF), the 
method provides unbiased estimates of treatment effects for each 
treatment/control combination entered. These effects can be directly 
interpreted in terms of their absolute or relative magnitudes and 
further evaluated against an error term yielding probability statements 
concerning hypothesis tests. 

Second, the method enables the development and testing of literally 
any hypothesis of interest within each replication sample at a known 
confidence (alpha) level. Moreover, through use of Bonferroni or Fisher 
techniques of constructing joint confidence intervals, post hoc 
hypotheses (comparisons or groupings of interest that may emerge after the 
data are analyzed) can be tested. 

Third, since each analysis produces unbiased estimates of program 
effects, these effects can be compared across replication samples for the 
appropriate subgroups. This provides a means for, say, examining first year 
effects for a given sponsor on Cohort I versus Cohort II data and hence, 
for evaluating improved implementation. Similarly, second year effects for 
Cohort I can be contrasted with first year effects for either Cohort I or 
Cohort II. And with a substantial amount of difficulty, hypothesis tests 
can be constructed for these interanalysis comparisons.* 

To clarify the analysis flexibility afforded by the one-way fixed 
effects ANCOVA, consider the following example. Cohort I-K pupil data 
consist of 330 classrooms distributed across 28 projects, for which 
two-year effects are analyzable.† Since each project contains two 
treatment groups (FT and NFT), there are a total of 56 cells in the one-way 
design for this data set, and the overall or omnibus F test for treatment 
effects is on 55 df. Since we are interested in estimating effects and 




* Such tests would require appropriate combination of estimated effect 
and error components from the separate analyses, and they would be 
based on the assumption that these components are independent across 
analyses. 

† For a project to be analyzable, classroom data must be available for 
at least two FT and two NFT classrooms. 

testing hypotheses concerning FT outcomes, this omnibus test is not of 
interest. Rather, through use of linear contrasts, 28 separate and 
orthogonal (mutually exclusive) tests corresponding to FT versus NFT groups 
within each project are of interest, and each of these tests is based on 
a single degree of freedom. Other linear contrasts, also with each on 
1 df, might be used to simultaneously contrast all projects within 
sponsors or groups of sponsors. Such contrasts would not, however, be 
orthogonal to the project-level contrasts. In short, through use of properly 
constructed contrasts each having the property $\sum_j c_j = 0$, where each 
$c_j$ is a coefficient (usually ±1 or 0) by which a level of treatment is 
multiplied, all possible comparisons of interest can be generated and 
tested at known (or estimated) confidence limits. Since treatment effects 
estimated by this fixed effects model are assumed unbiased, they can 
presumably be compared across analyses. This means that the estimated FT 
effect for a Cohort I project can be directly compared to the estimated FT 
effect for that (or any other) project in, say, Cohort II. And since these 
estimates are based on independent samples, the test statistic for such a 
cross-cohort comparison would be: 

\[ t_{N_1 + N_2 - 2} = \frac{a_1 - a_2}{\sqrt{SE_1^2 + SE_2^2}} \]

where 

    $a_i$ = project effects, 

    $SE_i$ = standard error of project effects, 

and $N_i$ = size of project sample. 

This is the familiar t-test for independent samples. 
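
A minimal sketch of such a cross-cohort comparison is given below. It 
assumes the statistic takes the usual form for independent estimates (the 
difference between the two effects divided by the square root of the sum of 
their squared standard errors, referred to N1 + N2 - 2 degrees of freedom); 
the function name and the input values are hypothetical and serve only as 
illustration. 

```python
import math


def cross_cohort_t(a1, se1, n1, a2, se2, n2):
    """t statistic and df for comparing two independent project effects.

    a1, a2   : estimated FT effects from the two analyses
    se1, se2 : their standard errors
    n1, n2   : sizes of the two project samples
    """
    t = (a1 - a2) / math.sqrt(se1 ** 2 + se2 ** 2)
    df = n1 + n2 - 2
    return t, df


# Invented values: effects of .5 and .1 with standard errors of .3 and .4.
t, df = cross_cohort_t(0.5, 0.3, 12, 0.1, 0.4, 10)
print(round(t, 2), df)
```

The independence assumption matters here: if the two analyses shared 
observations, the denominator would also need a covariance term. 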




We stated above that the one-way fixed effects ANCOVA enables testing 
all such outcome comparisons of interest at known (or estimated) confidence 
limits. In the actual analyses of these interim data, only tests on 
program effects (FT versus NFT) at the project and sponsor level were 
performed, and those were orthogonal within each analysis. (The next 
subsection discusses reasons for performing separate analyses on project- 
and sponsor-level groupings.) Although, strictly speaking, project and 
sponsor tests are nonorthogonal to each other, they are performed on 
slightly different data sets and thus are partially independent. This 
overlap in analysis is of concern only to the extent that it affects the 
width of the confidence interval that we construct to evaluate significance 
of results. For example, if we assume that sponsor- and project-level 
analyses are mutually exclusive, our confidence intervals for orthogonal 
tests within these analyses will be a function of the alpha value selected 
for significance testing. If we wish to take into account the 
nonindependence of the two levels of analysis, this confidence interval 
will have to be expanded in terms of a presumed joint probability 
distribution of the form $\alpha_p = 1 - (1 - \alpha)^n$, where n is the 
number of sponsor and project tests common to the same data subsets. 

This issue can perhaps best be clarified by specific example. The 
Cohort I-K two-year project analysis involves 28 projects (330 classrooms), 
while the sponsor analysis involves 12 sponsors (356 classrooms). Hence, 
there is an average of slightly more than two projects per sponsor. 
Project-level tests are independent, and so are sponsor tests. But, on 
the average, three tests per sponsor are performed: two at the project 
level, and one at the sponsor level. To maintain a .95 confidence 
interval for any test at either level, each individual test should be 
performed at the (1 - .983) or the .017 level. 



On the other hand, we could argue that the project level is the 
most appropriate and that, on the average, 1-1/2 tests are performed 
for each project. This suggests that the appropriate confidence 
interval for each individual test would be at the .965 ($\alpha$ = .035) 
level. Thus, it appears that the .95 confidence interval for individual 
tests, both project and sponsor level, will be biased toward Type I errors, 
so that, depending upon how one wishes to interpret the situation, the true 
alpha will be somewhere between .14 and .07. We believe that such bias 
is acceptable and possibly desirable in terms of offsetting the Type II 
bias because of lack of analysis power. 
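
The figures in the two preceding paragraphs follow directly from the joint 
probability expression given earlier, as this short sketch shows. The 
function names are hypothetical; the only inputs are the nominal alpha of 
.05 and the average number of overlapping tests (3 or 1-1/2) taken from the 
text. 

```python
def joint_alpha(alpha, n_tests):
    """Probability of at least one Type I error over n_tests tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def per_test_alpha(joint, n_tests):
    """Per-test alpha needed to hold the joint alpha at the given level."""
    return 1.0 - (1.0 - joint) ** (1.0 / n_tests)


for n_tests in (3, 1.5):
    print(n_tests,
          round(per_test_alpha(0.05, n_tests), 3),  # level each test would need
          round(joint_alpha(0.05, n_tests), 3))     # true alpha if tested at .05
```

With three tests the required per-test level is about .017 and the true 
alpha about .14; with 1-1/2 tests the corresponding figures are about .034 
and .07, in line with the rounded values quoted above. 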

The actual confidence intervals for interpreting the significance 
of each test are obtained by using the formula 

    95 Percent Confidence Interval = ±1.96 × Standard Error of Contrast.* 

* This corresponds to the general expression 

\[ \bar{X} - SE \cdot z_{\alpha/2} \le \mu \le \bar{X} + SE \cdot z_{\alpha/2} \]

Separate confidence intervals are calculated for each comparison. These 
intervals are then added to and subtracted from the corresponding 
estimated treatment effects. They provide a convenient 
and direct method of evaluating the significance of individual results. 
One method of reading the confidence intervals is: "We are .95 confident 
that the true FT effect for (project or sponsor comparison) is at least 
(lower interval estimate) and as much as (upper interval estimate)." 
If the confidence interval crosses zero, we conclude nonsignificance. 
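
The reading rule can be expressed in a few lines. This is a sketch only: 
the function names and the numerical inputs are hypothetical, and the 1.96 
multiplier is the one given in the formula above. 

```python
def confidence_interval(effect, se_contrast, z=1.96):
    """Lower and upper limits: effect -/+ z times the standard error."""
    half_width = z * se_contrast
    return effect - half_width, effect + half_width


def is_significant(effect, se_contrast, z=1.96):
    """A result is nonsignificant when the interval crosses zero."""
    lower, upper = confidence_interval(effect, se_contrast, z)
    return lower > 0.0 or upper < 0.0


# Invented estimates: the same standard error with two different effects.
for effect in (0.5, 0.3):
    lower, upper = confidence_interval(effect, 0.2)
    print(effect, round(lower, 3), round(upper, 3), is_significant(effect, 0.2))
```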

Arguments for Parallel Analysis at Project and Sponsor Levels 
of Treatment 

We have indicated that separate ANCOVAs were performed on project- 
and sponsor-level groupings of the data. The decision to conduct such 
parallel analyses was based on several considerations, including number 
of observations, assumptions regarding district-level variance, and 
evaluation objectives. These considerations are discussed in the 
following paragraphs. 

With regard to number of observations, one of the most serious 
weaknesses of this interim evaluation is the lack of power in 
statistical tests, primarily due to limited observations. This problem is 
aggravated when project-level groupings of the data reveal that often 
only a single control classroom is available for given projects, thus 
excluding these data from the project-level analyses. But if 
observations within cohorts are grouped at the sponsor level such that the 
sponsor defines the treatment variable, then all available and valid 
data can be included. To reduce the impact of the pooled within-sponsor 
district variance on estimation of treatment effects, appropriate 
district-level covariables are obtained from the project descriptors 
and incorporated in the ANCOVAs. With the exception of Cohort II-EF data, 
these sponsor-level analyses result in fewer treatment groups and more 
observations, yielding notable increases in degrees of freedom for error 
variance. 

The effect of pooling district variance within a sponsor is likely 
to obscure, to some extent, estimates of variance due to treatment. We 
considered statistically correcting the data, but solutions involving 
least-squares adjustments or corrections require that we model the assumed 
district effects as constant within and across sponsors, which is 
probably false. This means that hypothesis tests regarding FT/NFT outcomes 
for sponsor-level analyses will most likely be too conservative. On the 
other hand, estimates of FT/NFT effects should not be biased by district 
effect. And sponsor-level tests showing significance should be quite 
generalizable, provided a reasonable number of projects comprise the 
test. 

Finally, one goal of the evaluation is to identify the programs 
that produce measurable impacts on a national level. Sponsor analyses 
include all currently available evidence of such impacts, whereas, as 
noted above, project-level analyses exclude many observations because 
of variance estimation problems. Hence, the sponsor analyses can, in a 
sense, be considered more appropriate to the overall evaluation 
objectives but less appropriate in terms of detection of significant 
interim program effects. 



Consideration of Alternative Methods of Analysis 

In the course of developing our final method of analysis, we explored 
several alternative methods, each designed to deal with a major difficulty 
encountered in the interim data. One such alternative addressed the 
problem of FT/NFT comparability, another dealt with problems of degrees of 
freedom (limited observations), and a third set of alternatives was 
considered for techniques of covariable adjustment and bias reduction. 
Since a detailed presentation of our research into these alternatives 
is beyond the scope of this report, we will present only a brief 
discussion of each. 



Problems in Control Group Matches 

As previously indicated, preliminary inspection of the interim 
data suggested moderate and occasionally severe noncomparability of 
treatment and control pupils on many demographic and experiential 
variables. As might be expected, this noncomparability was most severe at 
the population extremes; i.e., the FT samples tended to be more 
disadvantaged than the NFT samples. This problem became particularly acute 
when, in a preliminary analysis (SRI, 1972a), we restricted our 
observations to just those FT and NFT pupils whose families met the OEO 
poverty guidelines (about 60 percent of the data). This restriction 
resulted in a disproportionate representation of FT pupils in the sample 
and occasionally nearly totally excluded the in-district controls. To 
develop sufficient data for analysis, we reasoned that careful matching on 
concomitants of district variance on a pupil-to-pupil basis should 
effectively account for the district effects, thus enabling us to pool 
control pupils across districts and to implement a post hoc matched-pairs 
analysis for FT projects on a project-by-project basis. This matching 
involved arranging all "eligible" (i.e., meeting OEO poverty guidelines) 
NFT pupils within each cohort on the selected matching variables 
(preschool experience, ethnicity, sex, parent education, and parent-child 
interaction score). Then, for each "eligible" FT pupil within the project, 
an NFT "match" was drawn with respect to these five matching variables. 
This matching was constructed independently from project to project, with 
replacements across but not within projects. 

This procedure had the advantages of increasing both the power 
and precision of the analysis (as well as enabling an analysis in the first 
place) and of providing estimates of FT effects at the child level. The 
disadvantages of this approach included the use of post hoc matches, the 
nonindependence of units across projects, and the lack of attention to 
classroom-level effects. Currently, we feel this method might be useful 
for detection and analysis of subtle or complex interactions at the child 
level (i.e., aptitude by treatment interactions) but that the approach is 
inappropriate for assessment of overall program effect (i.e., it lacks 
the necessary generalizability for evaluation of a national program). 

On the other hand, if national or regional level effects can 
be established, the next step might well be that of analyzing for 
differential effects on the individual level. The thrust of such an 
approach would most likely involve the identification of patterns of 
results for purposes of individual diagnoses and prescription. This implies 
we could precisely define antecedents, treatments, and consequents at the 
level of the individual child, a capability that we are currently 
attempting to develop (but that is not present in these interim data). 



For the overall problem of noncomparability of control and 
treatment groups on the classroom-level bases (i.e., our current analyses), 
there does not appear to be a convenient solution. However, the impact of 
the problem is lessened by recalling that our model assumes classroom 
variance dominates other nontreatment sources of variance within the 
district. Since the NFT samples do control for district variance, and since 
reasonable efforts are made to match FT/NFT classrooms, our analysis method 
should appropriately and validly detect the treatment variance. 

Limited Degrees of Freedom 



Another difficulty encountered in the current analysis is the 
limited observations when classroom-level units are used. The direct 
implications of this problem have been discussed both in terms of 
assumptions of the covariance model and of effects on the power of the 
analyses. An alternative analysis procedure, which it was thought might 
produce some savings in degrees of freedom lost to covariables, was 
explored. This procedure is described as indirect standardization, and it 
is presented in detail in a recent paper by David Wiley (in press). The 
following excerpt is particularly relevant: 

In indirect standardization, instead of applying a set 
of reference proportions to the subgroup means for each 
group, we calculate the subgroup means for the whole 
group and then use the subgroup proportions in each 
comparison group to produce a predicted value for each 
subgroup. These predicted values are a forecast of 
the values which would result if there were no 
differences between the comparison groups except those 
generated by the unequal performance of the subgroups 
and the unequal distribution of subgroups in the 
comparison groups. These values may be used to adjust the 
original comparison group means, by estimating the bias 
due to the unequal distribution and eliminating it 
(pages 9-10). 
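
One way to read this description is sketched below. It is an 
interpretation, not Wiley's published procedure: the predicted value for a 
comparison group is taken to be the whole-sample subgroup means weighted by 
that group's own subgroup proportions, the bias is taken to be the 
predicted value minus the grand mean, and the adjusted mean is the observed 
mean minus that bias. The function name and the data are invented for 
illustration. 

```python
from collections import defaultdict


def indirect_standardize(records):
    """records: iterable of (comparison_group, subgroup, score) tuples."""
    records = list(records)
    subgroup_scores = defaultdict(list)
    group_scores = defaultdict(list)
    group_subgroup_counts = defaultdict(lambda: defaultdict(int))
    for group, subgroup, score in records:
        subgroup_scores[subgroup].append(score)
        group_scores[group].append(score)
        group_subgroup_counts[group][subgroup] += 1

    # Subgroup means and grand mean computed over the whole sample.
    subgroup_mean = {s: sum(v) / len(v) for s, v in subgroup_scores.items()}
    grand_mean = sum(score for _, _, score in records) / len(records)

    results = {}
    for group, scores in group_scores.items():
        n = len(scores)
        observed = sum(scores) / n
        # Forecast from the group's own mix of subgroups alone.
        predicted = sum(count / n * subgroup_mean[s]
                        for s, count in group_subgroup_counts[group].items())
        bias = predicted - grand_mean
        results[group] = {"observed": observed,
                          "predicted": predicted,
                          "adjusted": observed - bias}
    return results


# Invented data: no treatment effect, but FT has more low-scoring pupils.
data = ([("FT", "low", 10)] * 3 + [("FT", "high", 20)]
        + [("NFT", "low", 10)] + [("NFT", "high", 20)] * 3)
for group, values in indirect_standardize(data).items():
    print(group, values)
```

In this constructed example the observed FT and NFT means differ only 
because of the unequal mix of subgroups, and the adjustment removes that 
difference entirely. 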

For data sets composed of relatively few observations, this 
procedure appears to offer the advantage that a large number of 
concomitants or covariables could be used, via conventional regression 
techniques, to produce a single composite (or indirectly standardized) 
control variable. This resultant control variable employs a single degree 
of freedom in ANCOVA, whereas multiple covariates would correspondingly 
use multiple degrees of freedom. 

Several important analysis issues must be resolved before the 
indirect standardization approach can be generally adopted. First, the 
approach subsumes all assumptions of conventional ANCOVA in addition to 
those involved in the standardization procedure. Second, the approach 
is more likely to operate on bias than on error variance. This argues in 
favor of the procedure, since it is bias that is most evident in our 
descriptive analysis of within-project control groups. Third, it is not 
intuitively clear that the same probability distribution (i.e., the F 
distribution) used for hypothesis tests for conventional ANCOVA would be 
appropriate for indirect standardized ANCOVA tests. 

To evaluate empirically the relative consequences of proceeding 
with conventional ANCOVA, as compared with indirectly standardized ANCOVA, 
we applied both procedures to a subset of data in this interim analysis. 
The specific subset was the parent-level outcome analyses, where it is 
believed that the estimation problems were most variable from project to 
project and where missing data problems were moderate. The results of these 
two procedures are summarized in Tables A-4 to A-7. Using conventional 
ANCOVA as the standard, the results of indirect standardized ANCOVA appear 
to conform to the above prediction; namely, indirect standardization 
produces a substantial shift toward FT-favoring results (i.e., it produces 
greater adjustment for NFT bias), whereas the conventional ANCOVA procedure 
displays more results as significant. Hence, for these data, conventional 
ANCOVA appears to optimize on error variance reduction, whereas indirect 
standardized ANCOVA optimizes on bias reduction.* 



Alternate Techniques for Covariable Adjustment 
and Bias Reduction 

Two alternate techniques suggested for covariable adjustment 
and bias reduction in the analysis of these interim data were: 

• Correction of covariable weights for unreliability 
(Porter, 1967). 

• Estimation of FT effects by deviation from 
subgroup (project or district) means, as opposed to 
grand (cohort) means. 

The basis for the first alternative is well described elsewhere (Porter, 
1967, 1972; Glass et al., 1972) and is discussed here only for purposes 
of completeness. Specifically, under the assumption that many covariable 
measures represent fallible data, an adjustment in the covariable re- 
gression coefficient (i.e., the "beta weights") can be made to reflect 
the expected value if measurement were error-free. This adjustment is 
essentially equal to the proportional difference of the reliability of 



* 

Note that another important distinction is that, for these analyses, 
ANCOVA is performed on unweighted means, whereas the indirect stan- 
dardization procedure was based on pupil data that would have pro- 
duced weighted-means predicted scores; hence, the selective sensitivity 
to bias. 

TABLE A-4 

ADJUSTED PARENT OUTCOMES OBTAINED BY MEANS OF ANCOVA 
VERSUS INDIRECT STANDARDIZATION PROCEDURES: 
COHORT I, KINDERGARTEN 

            PARENT/CHILD      PARENT/SCHOOL     CHILD ACADEMIC    SENSE OF
            INTERACTION       INVOLVEMENT       EXPECTATIONS      CONTROL
                      IND.              IND.              IND.              IND.
PROJECT      ANCOVA   STD.     ANCOVA   STD.     ANCOVA   STD.     ANCOVA   STD.



.37 


.18 


.35 


.27 


.28 


.38 




.64 


.61 




.04 


-.02 ^ 


-.18 


-.24 


.02 


. 12 


_ 


.36 


-.29 


OO \Q. ) 


.18 , 


-.08 


.00 


.06 


.16 


.76 




.43 


.48 


k5k5 \G ) 


.06 


-.r2 


-.42 


-.03 


- .34 


.13 




.51 


-.23 


r w \. 


.24 


.14 


.09 


.45 


- .53 


-.40 


-1 


.12* 


-.83* 


A »i v u y 


- . 14 


-.18 


.72* 


.70* 


. 11 


- .29 




.04 


-.16 


TTU/ ( n\ 

r W V.C ^ 


.28 


- .23 


-.09 


.15 


-.80 


-.34 




.11 


.40 




.14 


.09 


.36 


.54 


.19 


.63 




.01 


.18 


1 T A f rl 

Rr ^ n 


.45 


.25 


70* 


.72* 


.59 


.64 


- 


.45 
— 


-.40 


i>v^ vay 
BC(b) 


-.16 


- .23 


.66 


.52 


.18 


-.07 




.28 


.21 


BC(c) 


.40 


,.20 


.74 


.76 


-.36 


.22 




.43 


.52 


BC(e) 


.59* 


.31 


.33 


.70* 


- .08 


.28 




.28 


-.03 


UO(a) 


-.19 


-.32 


.30 


.39 


-.56 


.41 




.16 


.33 


UO(b) 


.50 


.56 


.30 


.68 


-.02 


-.00 




.02 


.48 


UO(c) 


-.03 


-.04 


.16 


.36 


.42 


.84 




.25 


.00 


UK (a) 


.34 


-.07 


. 11 


.23 


.21 


.55 




.26 


-.03 


UK(b) 


.16 


-.17 


.65 


. 52 


-.45 


-.79 




.22 


.25 


UK(c) 


.49 


-.46 


.02 


.14 


- .13 


-.18 




.37 


.38 


HS(c) 


-.03 


-.01 


.05 


-.15 


-.40 


-.47 




.13 


-.16 


UF(a) 


- .12 


-.09 


.08 


.00 


.14 


.06 




.14 


.12 


UF(c) 


.44 


.07 


.50 


.37 


-.09 


-,51 




.06 


-.15 


ED(b) 


-.21 


-.41 


-.22 


-.32 


.20 


.02 




.20 


.13 


ED(c) 




















NY(a) 


-.01 


-.18 


.17^ 


.45 


-.39 


-.15 




.34 


.49* 


NY(b) 


-.14 


-.ll- 


.29 


.36 


-.33 


-.25 




.34 


.36 


SW 


.08 


-.14 


-.32 


-.15 


-.07 


.36 




.18 


.32 


PI 


.07 


.19 


.31 


.25 


.53 


.39 




.61 


-.64 


OVERALL 


.13 


-.03 


.22 


.30 


-.06 


.09 




.01 


.09 


DIFFERENCE 


.16 




-.08 




- .15 






.10 





(COV. - IND. STD.) 

* p < .05. 

TABLE A-5 

ADJUSTED PARENT OUTCOMES OBTAINED BY MEANS OF ANCOVA 
VERSUS INDIRECT STANDARDIZATION PROCEDURES: 
COHORT I, ENTERING FIRST 

            PARENT/CHILD      PARENT/SCHOOL     CHILD ACADEMIC    SENSE OF
            INTERACTION       INVOLVEMENT       EXPECTATIONS      CONTROL
                      IND.              IND.              IND.              IND.
PROJECT      COV.     STD.     COV.     STD.     COV.     STD.     COV.     STD.

SS(a)        .72*     .42      .83*     .97*^    .26      .44      .18     -.10
UA(b)        .28      .10      .52      .68*    -.44      .31     -.20     -.00^
UA(c)       -.39     -.28     -.07      .47     -.02      .65      .87*    1.28*^
BC(d)        .18      .09     -.08      .42     -.93*    -.06      .42      .21
UG           .68      .54      .06      .58     -.58      .16      .29
UO(d)       -.04      .26     -.31      .41     -.34      .72*    -.10      .34
UO(e)
HS(a)        .78      .33     1.40*     .87*     .81*     .05      .12      .24
HS(b)        .28*     .13     -.35      .31     -.36      .12     -.79*    -.47
UF(b)       -.24     -.27      .76*     .92*    -.11     -.12     -.34     -.11
ED(a)        .29      .23      .81*     .68*    -.22     -.12      .31      .45

OVERALL      .25      .16      .36      .63     -.19      .22      .08      .24

DIFFERENCE   .09              -.27              -.41              -.16
(COV. - IND. STD.)

* p < .05. 
^ p < .001. 



TABLE A-6

ADJUSTED PARENT OUTCOMES OBTAINED BY MEANS OF ANCOVA
VERSUS INDIRECT STANDARDIZATION PROCEDURES:
COHORT II, KINDERGARTEN

                PARENT/CHILD        PARENT/SCHOOL       CHILD ACADEMIC      SENSE OF
                INTERACTION         INVOLVEMENT         EXPECTATIONS        CONTROL
PROJECT         COV.   IND. STD.    COV.   IND. STD.    COV.   IND. STD.    COV.   IND. STD.

FW(a)           -.18       .06       .64       .33      -.03       .40      -.19      -.11
FW(b)            .79*      .75       .71       .24       .12                 .41       .37
FW(c)           -.05       .21      -.62      -.43      -.54      -.33       .22       .47
BC(c)            .15       .23       .03      -.14       .12      -.35       .27       .21
UO(a)           1.03*      .80       .59       .39       .35       .17      -.53      -.23
UF(a)            .06       .10       .35       .26       .19       .12       .38       .40
ED(b)
NY(a)            .30       .27       .26       .01      -.29       .06       .36       .38
OVERALL          .30       .35       .28       .09      -.01       .09       .13       .21
DIFFERENCE      -.05                 .19                -.10                -.08
(COV. - IND. STD.)

* p < .05.

TABLE A-7

ADJUSTED PARENT OUTCOMES OBTAINED BY MEANS OF ANCOVA
VERSUS INDIRECT STANDARDIZATION PROCEDURES:
COHORT II, ENTERING FIRST

                PARENT/CHILD        PARENT/SCHOOL       CHILD ACADEMIC      SENSE OF
                INTERACTION         INVOLVEMENT         EXPECTATIONS        CONTROL
PROJECT         COV.   IND. STD.    COV.   IND. STD.    COV.   IND. STD.    COV.   IND. STD.

UA(c)            .08      -.27       .10      -.00       .14      -.42      -.70      -.59
BC(d)            .31       .10       .46       .50       .05      -.17       .18      -.22
UO(d)           -.19       .06       .27       .11      -.10       .26      1.56       .24
UF(b)
OVERALL          .07      -.04       .28       .20       .03      -.11       .35      -.19
DIFFERENCE       .11                 .08                 .14                 .54
(COV. - IND. STD.)

the measure and a perfectly reliable measure (i.e., error-free). The
net effect of the procedure is an increase in the slope of the betas,
i.e., greater covariable adjustment. However, since our data are
aggregated to classroom-level variables, reliability estimates of all
covariables are such that this correction procedure would have virtually
no net effect. All covariables for pupil and parent analyses have
estimated reliabilities (where estimable) in excess of .95. The
reliability of teacher covariables cannot currently be estimated, so
these covariables must be assumed to be error-free.
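
The size of such a correction can be gauged with a minimal sketch. It
assumes the simplest classical measurement-error model with a single
covariable, in which the corrected slope equals the observed slope
divided by the covariable's reliability. The function name and the
observed slope of 0.80 are hypothetical, and the correction procedure
referred to above may differ in detail.

```python
def disattenuated_slope(observed_slope: float, reliability: float) -> float:
    """Correct a regression slope for unreliability in a single covariable.

    Assumes the classical errors-in-variables model, in which measurement
    error shrinks the observed slope by a factor equal to the reliability.
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must be in the interval (0, 1]")
    return observed_slope / reliability


# Hypothetical observed slope; .95 is the lower bound on reliability cited above.
observed = 0.80
corrected = disattenuated_slope(observed, 0.95)
relative_increase = corrected / observed - 1.0
print(f"corrected slope = {corrected:.3f}, relative increase = {relative_increase:.1%}")
```

Under this assumption, a reliability of at least .95 steepens the slope
by only about 5 percent at most, which is consistent with the statement
that the correction would have virtually no net effect.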

The second procedure, using district- or project-level subgroup
means to deviate cell means for covariable adjustments, does produce
different absolute cell values compared with deviation of cell means
from grand means. However, our goal is to estimate and to interpret
relative FT/NFT differences, and these relative cell estimates are
identical for both procedures. Since computer methods exist for
conventional (grand mean) adjustments, we chose to follow this procedure.
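
The equivalence of the two procedures for relative differences can be
illustrated with a short sketch. It assumes a single covariable, a
common (pooled) slope, and the same reference mean applied to both the
FT and NFT cells. All names and numbers are hypothetical.

```python
def adjusted_mean(cell_mean_y, cell_mean_x, slope, reference_mean_x):
    """Covariance-adjusted cell mean, evaluated at the reference covariable level."""
    return cell_mean_y - slope * (cell_mean_x - reference_mean_x)


def ft_nft_difference(ft, nft, slope, reference_mean_x):
    """Adjusted FT minus NFT difference; ft and nft are (mean_y, mean_x) pairs."""
    return (adjusted_mean(ft[0], ft[1], slope, reference_mean_x)
            - adjusted_mean(nft[0], nft[1], slope, reference_mean_x))


# Hypothetical cell means (outcome, covariable) and pooled slope.
ft_cell, nft_cell, pooled_slope = (52.0, 10.0), (48.0, 8.0), 1.5

# Deviate from a grand mean (9.0) or from a subgroup mean (12.0).
diff_grand = ft_nft_difference(ft_cell, nft_cell, pooled_slope, reference_mean_x=9.0)
diff_subgroup = ft_nft_difference(ft_cell, nft_cell, pooled_slope, reference_mean_x=12.0)
print(diff_grand, diff_subgroup)
```

The reference mean shifts both adjusted cell means by the same amount,
so it cancels out of the FT/NFT difference. Only the absolute cell
values depend on the choice of reference.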

The actual results of the overall regressions of the outcome
variables on the covariables are tabulated and summarized in Annex B.
These tables are prepared for the project-level analyses and present
summary and regression statistics for each major group of analyses.


Annex B

PSYCHOMETRIC AND REGRESSION DATA



This annex consists of two parts. The first part presents the
results of reliability analyses of the pupil outcome measures. The second
part contains summary and regression statistics obtained from the
separate ANCOVAs performed on the project-level data.



Part 1: Psychometric Data - Item Analyses and Reliability
Data for Pupil Outcome Measures

The item analyses and relevant statistical information on the 1971 
pupil outcome measures were compiled separately for each cohort group 
evaluated in this report. Both Follow Through and Non-Follow Through 
scores were pooled in computing item and test statistics for outcome 
variables for each cohort by grade streams. 

Table B-1 summarizes the principal statistical results of the
reliability analyses. Included in the table are mean scores, standard
deviations, coefficient alpha reliability estimates, standard errors of
measurement, indices of skew and kurtosis, and the number of cases. In
all, nearly 14,000 pupils contributed to these data: over 7,500 in
Cohort I, Kindergarten (CI-K), 3,200 in Cohort I, Entering First (CI-E),
2,000 in Cohort II, Kindergarten (CII-K), and 1,000 in Cohort II,
Entering First (CII-E). Measures included were the WRAT, achievement, and
disaggregated component variables. The affect measure was not included
in this analysis.

Inspection of Table B-1 reveals that remarkably high reliability
estimates are obtained for these measures. In particular, reliability
of the achievement measure ranges from a low of .964 to a high of .986.
WRAT varies from .934 to .973. The quantitative, reading, and language
measures range from .762 to .982, with a median value of .92. The
cognitive processes measure, which was omitted from the CI-E battery,
displayed the poorest measurement properties, with reliability varying
from .580 to .760.
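
For reference, coefficient alpha can be computed from a
pupils-by-items score matrix as in the sketch below. It uses the
standard formula, alpha = k/(k-1) * (1 - sum of item variances /
variance of total scores). The function name and the small score matrix
are hypothetical and are not drawn from the Follow Through data.

```python
from statistics import variance


def coefficient_alpha(item_scores):
    """Coefficient alpha for a pupils-by-items score matrix (one row per pupil)."""
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("alpha requires at least two items")
    # Sample variance of each item across pupils.
    item_variances = [variance([row[j] for row in item_scores])
                      for j in range(n_items)]
    # Sample variance of each pupil's total score.
    total_variance = variance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical pass/fail scores for five pupils on four items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = coefficient_alpha(scores)
print(round(alpha, 3))
```

Alpha rises as the items covary more strongly and as the number of
items grows. This is one plausible reason why the short cognitive
processes scale shows lower values than the longer composites.
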

TABLE B-1

TEST STATISTICS AND RELIABILITY DATA FOR DEPENDENT
VARIABLES AND COMPONENTS IN THE FOLLOW
THROUGH COGNITIVE TEST BATTERY

                              Sample Base
                      CI-K     CI-EF     CII-K    CII-EF

ACHIEVEMENT
  MEAN               119.9     141.0     100.8     145.7
  S.D.               32.07     46.02     25.00     29.09
  RELIAB.             .972      .986      .964      .969
  STD ERROR           7.53      7.67      6.65      7.19
  SKEWNESS           -.215     -.514     -.551     -.742
  KURTOSIS           -.367     -.679     -.241      .867
  N                   7427      3237      1937       778

WRAT
  MEAN                69.2     110.7      44.8      63.7
  S.D.               16.05     18.73     13.59     15.04
  RELIAB.             .955      .973      .934      .952
  STD ERROR           4.76      4.32      4.85      4.60
  SKEWNESS           -.059     -.803     -.137     -.096
  KURTOSIS           -.288     -.161     -.155     -.209
  N                   7587      3237      1994       793

QUANTITATIVE
  MEAN                38.5      39.2      27.7      44.3
  S.D.               10.49     12.78      7.?7      8.85
  RELIAB.             .928      .951      .880      .907
  STD ERROR           3.91      3.95      3.64      3.72
  SKEWNESS           -.558     -.803     -.672     -.926
  KURTOSIS           -.2?1     -.161      .068     1.028
  N                   7427      3237      1941       778

READING
  MEAN                52.7       78.      51.2      69.3
  S.D.               14.98     28.40     13.97     14.37
  RELIAB.             .948      .982      .950      .948
  STD ERROR           4.76      5.36      4.36      4.57
  SKEWNESS           -.135     -.471     -.580     -.797
  KURTOSIS           -.257     -.934     -.334     1.157
  N                   7427      3237      1956       782

LANGUAGE
  MEAN                21.7      23.6      14.6      22.6
  S.D.                8.21      8.19      3.55      6.33
  RELIAB.             .888      .916      .762      .851
  STD ERROR           3.78      3.28      2.30      3.32
  SKEWNESS            .125     -.167     -.460     -.071
  KURTOSIS           -.505     -.533      .638     -.262
  N                   7427      3243      1982       778

COG. PROCESSES
  MEAN                 7.0                 7.0       9.4
  S.D.                1.72                2.67      2.33
  RELIAB.             .580                .760      .724
  STD ERROR           1.40                1.73      1.61
  SKEWNESS           -.703               -.485    -1.153
  KURTOSIS            .227               -.550     1.103
  N                   7?63                2008       778



Table B-2 presents the detailed results of the item analysis for the
contents of the 1971 Follow Through Pupil Test Battery. These results
display the item difficulty (percent passed) and variance for the test
samples on each of the items contained in the battery. Items are arranged
in terms of the major achievement components (i.e., quantitative, reading,
language, and cognitive processes); thus, this table serves to define
the variables operationally as well as to display item statistics. Also
included is the booklet source of each item. Since not all items were
given to all pupils, the pattern of administration is also noted in this
table.
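
The two item statistics named above can be sketched as follows. The
sketch assumes pass/fail scoring, so that an item's variance is
p(1 - p), where p is the proportion passing. It also assumes that
pupils who were not administered an item are excluded from that item's
base. The function name and the responses are hypothetical, and the
report's own computations may have differed.

```python
def item_statistics(responses):
    """Return (percent_passed, variance, n_administered) for each item.

    responses: one list per pupil; 1 = passed, 0 = failed,
    None = item not administered to that pupil.
    """
    n_items = len(responses[0])
    results = []
    for j in range(n_items):
        answered = [row[j] for row in responses if row[j] is not None]
        if not answered:
            # Item given to no one in this sample: nothing to report.
            results.append((None, None, 0))
            continue
        p = sum(answered) / len(answered)  # proportion passing
        results.append((100.0 * p, p * (1.0 - p), len(answered)))
    return results


# Hypothetical responses for five pupils on three items.
sample = [
    [1, 1, None],
    [1, 1, 1],
    [1, 0, 0],
    [1, 0, None],
    [0, 0, 0],
]
stats = item_statistics(sample)
for percent, var, n in stats:
    print(f"{percent:.1f}% passed, variance {var:.3f}, n = {n}")
```

Under this scoring, variance peaks at .25 when half the pupils pass an
item and falls toward zero for very easy or very hard items.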

Of particular interest are the apparent scalogram properties in terms
of item difficulties as noted in this table. For example, increasing
difficulties can be noted to correspond to item sequences and to grade
levels (or cohorts) for the arithmetic, reading, and spelling sections
of the WRAT. These item properties correspond well to those described
by the authors (Jastak and Jastak, 1965) and indicate this test has
desirable measurement properties. With few exceptions, other groups of
items produced very uniform high or low difficulty indices, suggesting
the test might profitably be shortened in these areas. The notable
exceptions are for the MAT and SAT items in language and reading. Also,
the letter discrimination items did not generally differentiate
performance, which suggests questionable utility.

In sum, the data presented in this table are considered particularly
useful for subsequent planning and test selection. Also, although the
overall reliability is quite high, there does not appear to be any
striking evidence that this is because of items other than those in the
WRAT.

Part 2: Summary Statistics and ANCOVA Regression Data

This part of Annex B presents summary tables of regression analyses
of outcome measures on control variables. Tables B-3 to B-18 are
summarized from each of the 12 independent ANCOVAs conducted on the
project-level data. The covariable regression data from each analysis
(pupil, parent, teacher for CI-K, CI-EF, CII-K, and CII-EF) are
summarized in a separate table (two tables are required to display the
pupil results, which cover eight variables).

The entries in a given table show the sample size (N), the "error"
degrees of freedom (residual df), descriptive statistics* (mean and
standard deviation) for each covariable included in the analysis, and
regression statistics (zero-order correlation coefficient, or r0; raw
regression coefficient; standardized regression coefficient, or beta
value; and the standard error of the raw regression coefficient) for
each dependent variable on the covariables. Finally, summary statistics
showing the mean and variance of each dependent variable (both before
and after regression on covariables) and the variance explained by the
covariables (r²) are presented.
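
How these table entries relate to one another can be seen in a minimal
sketch for the simplest case of one dependent variable regressed on a
single covariable. The residual degrees of freedom are taken as n - 2
here, whereas the ANCOVAs reported in the tables also remove degrees of
freedom for design effects and additional covariables. The function
name and the data are hypothetical.

```python
from math import sqrt


def simple_regression_summary(x, y):
    """Regression statistics for dependent variable y on one covariable x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    raw_b = sxy / sxx                      # raw regression coefficient
    r = sxy / sqrt(sxx * syy)              # zero-order correlation
    sd_x, sd_y = sqrt(sxx / (n - 1)), sqrt(syy / (n - 1))
    beta = raw_b * sd_x / sd_y             # standardized coefficient
    residual_df = n - 2
    residual_variance = (syy - raw_b * sxy) / residual_df
    return {
        "n": n,
        "residual_df": residual_df,
        "r": r,
        "raw_b": raw_b,
        "beta": beta,
        "se_b": sqrt(residual_variance / sxx),   # standard error of raw_b
        "variance_before": syy / (n - 1),
        "variance_after": residual_variance,
        "r_squared": r * r,
    }


# Hypothetical covariable (x) and outcome (y) values.
summary = simple_regression_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With a single covariable the beta value equals the zero-order
correlation. With several covariables the two generally differ, which
is why the tables report both.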

Inspection of these tables reveals the highly variable contribution
of covariance analysis to error reduction. In some instances, the high
variance reduction is due to problems of regression shrinkage (e.g.,
Cohort II-E) and should be disregarded. Overall, the covariance
regressions appear to have produced about a 50-percent reduction in
error variance on pupil measures. The regression effects tend to be
somewhat less pronounced but highly variable for parent and teacher
analyses, which suggests that better covariables could be selected for
future analyses.



* The means and standard deviations of the baseline test measures are
presented in transformed scale. This scale has the parameters of mean =
zero, and standard deviation = reliability of measure.

FOLLOW THROUGH PROGRAM SPONSORS



AFRAM PARENT IMPLEMENTATION APPROACH 

Afram Associates, Inc. 
68-72 E. 131st Street 
Harlem, New York 10037 

Director: Preston Wilcox 



EDC OPEN EDUCATION FOLLOW THROUGH PROGRAM

Education Development Center
55 Chapel Street
Newton, Massachusetts 02160

Director: George E. Hein



BANK STREET COLLEGE OF EDUCATION APPROACH 

Bank Street College of Education 

610 W. 112th Street 

New York, New York 10025 

Director: Elizabeth C. Gilkeson 



FLORIDA PARENT EDUCATION MODEL

University of Florida
513 Weil Hall
Gainesville, Florida 32601

Director: Ira J. Gordon



BEHAVIOR ANALYSIS APPROACH 

Support and Development Center 

for Follow Through 
Department of Human Development 
University of Kansas 
Lawrence, Kansas 66044 

Director: Don Bushell, Jr. 

CALIFORNIA PROCESS MODEL 

California State Department of Education 
Division of Compensatory Education 
Bureau of Program Development
721 Capitol Mall 
Sacramento, California 95814 

Director: James Jordan 

COGNITIVELY ORIENTED CURRICULUM MODEL 

High/Scope Educational Research Foundation 
125 North Huron Street 
Ypsilanti, Michigan 48197 

Director: David P. Weikart 

CULTURAL LINGUISTIC FOLLOW THROUGH APPROACH 

Center for Inner City Studies 
Northeastern Illinois University 
700 E. Oakwood Boulevard 
Chicago, Illinois 60653 

Directors: Nancy L. Arnez
Clara Holton 

CULTURALLY DEMOCRATIC LEARNING ENVIRONMENTS 

University of California, Riverside 
2316 Library South 
Riverside, California 92502

Director: Manuel Ramirez III 



HAMPTON INSTITUTE NON-GRADED FOLLOW 
THROUGH MODEL 

Hampton Institute 
Hampton, Virginia 23368 

Director: Mary T. Christian 

HOME-SCHOOL PARTNERSHIP: A MOTIVATIONAL 
APPROACH 

Clark College 

240 Chestnut Street, S.W. 

Atlanta, Georgia 30314 

Director: Edward E. Johnson 



INDIVIDUALIZED EARLY LEARNING PROGRAM 
University of Pittsburgh 

Learning Research and Development Center 

Project Follow Through

G6 Social Science Building 

Pittsburgh, Pennsylvania 15213 

Directors: Lauren Resnick 
Warren Shepler 

INTERDEPENDENT LEARNING MODEL 

Follow Through 

1700 Stewart Avenue S.W.

Atlanta, Georgia 30315 

Director: Frances Cox 

Follow Through 
Public School 76M 
220 West 121st Street 
New York, New York 10027 

Director: Altharanzo L. Thompson 



LANGUAGE DEVELOPMENT (BILINGUAL) EDUCATION 
APPROACH 

Southwest Educational Development 

Laboratory (SEDL) 
Follow Through Model 
800 Brazos Street 
Austin, Texas 78701 

Director: Preston C. Kronkosky 

MATHEMAGNETIC ACTIVITIES PROGRAM 

University of Georgia Follow Through 
Psychology Department 
Athens, Georgia 30601 

Director: C. D. Smock

THE NEW SCHOOL APPROACH TO FOLLOW THROUGH

University of North Dakota 
Center for Teaching and 
Learning 

Grand Forks, North Dakota 58201 
Director: Vito Perrone 

PARENT SUPPORTED APPLICATION OF THE BEHAVIOR 
ORIENTED PRESCRIPTIVE TEACHING APPROACH 

Georgia State University

Department of Early Childhood Education 

33 Gilmer Street 

Atlanta, Georgia 30303 

Director: Walter L. Hodges 



RESPONSIVE EDUCATIONAL PROGRAM 

Far West Laboratory for Educational 

Research and Development 
1855 Folsom Street 
San Francisco, CA 94103 

Director: Denis Thoms



RESPONSIVE ENVIRONMENTS CORPORATION EARLY
CHILDHOOD MODEL

Responsive Environments Corporation 
200 Sylvan Avenue 

Englewood Cliffs, New Jersey 07632 
Director: Lorie Caudle 

ROLE-TRADE MODEL 

Western Behavioral Sciences Institute

1150 Silverado 

La Jolla, California 92037 

Director: Wayman J. Crow 

TUCSON EARLY EDUCATION MODEL 

Arizona Center for Early Childhood Education
1515 E. First Street
University of Arizona 
Tucson, Arizona 85719 

Director: Joseph M. Fillerup 

UNIVERSITY OF OREGON ENGELMANN-BECKER MODEL

University of Oregon
College of Education
Follow Through Project
Eugene, Oregon 97403

Directors: Siegfried Engelmann
Wesley C. Becker



